FOREWORD


The  21st  Conference  on  the  Design  of  Experiments  in  Army  Research, 
Development  and  Testing  was  held  22-24  October  1975  in  Washington,  DC. 

The  Conference,  which  took  place  at  the  Walter  Reed  Medical  Complex,  had 
two  hosts:  the  Walter  Reed  Army  Medical  Center  and  the  Armed  Forces 
Institute  of  Pathology.  Both  hosts  furnished  excellent  conference  rooms 
and meeting rooms for this symposium. Planning for these meetings requires
painstaking attention to detail, and we are indebted to Dr. Walter D. Foster
and Dr. James N. Young, both of the Armed Forces Institute of Pathology,
for serving well as Chairmen for Local Arrangements. We are
pleased  that  Major  General  Robert  Bernstein,  Commander  of  the  Walter  Reed 
Army  Medical  Center,  opened  the  Conference  and  welcomed  us.  This  is  not 
the  first  meeting  to  be  held  at  the  Walter  Reed  installation.  On  each 
occasion,  the  reception  given  us  has  been  excellent,  and  we  look  forward 
to  meetings  there  again  in  the  future. 

There  were  four  addresses  by  invited  speakers.  Traditionally  an 
attempt  is  made  by  the  Program  Committee  to  have  expository  talks  on 
themes  somewhat  pertinent  to  the  mission  of  the  Army  installation  at 
which  the  annual  conference  is  held.  Success  along  these  lines  was 
achieved  again.  The  first  address  was  given  by  Frederick  Mosteller  of 
Harvard University, who spoke on "Success in Social and Medical
Experimentation." Dr. Mosteller was given, at his request, two hours to
deliver his address. Normally there would have been five invited addresses,
but the length of Professor Mosteller's talk reduced the number to four at this meeting.
Dr. Mosteller's talk was given on the first morning of the Conference
and  was  followed  in  the  late  afternoon  by  two  papers  on  clinical  trials. 
There  has  been  much  in  the  medical  and  statistical  literature  on  this 
topic.  Professor  Edmund  A.  Gehan  of  the  University  of  Texas  System 
Cancer  Center  spoke  on  "Non-randomized  Clinical  Trials"  and  Professor 
Paul  Meier  of  the  University  of  Chicago  addressed  the  audience  on 
"Randomized  Clinical  Trials."  On  the  second  day  of  the  Conference, 
Professor Seymour Geisser of the University of Minnesota gave an invited
address on "Predictive Sample Reuse." This was followed on the
morning  of  the  last  day  of  the  Conference  by  a  talk  on  "Normality  and 
Disease"  given  by  Professor  Edmond  A.  Murphy  of  the  Johns  Hopkins 
Medical  School. 

One  major  purpose  of  the  Conference  is  to  bring  together  those 
engaged  in  scientific  work  in  Army  installations  with  investigators 
from  other  government  agencies  and  those  from  university  life.  This 
interaction  has  been  going  on  successfully  since  the  inception  of  the 
program.  Statisticians  and  others  in  Army  installations  discuss  their 
work at technical sessions and clinical sessions at each annual conference.
For this Conference there were seven technical sessions comprising
24 papers and four clinical sessions. At the clinical sessions
a panel of experts responds to problems raised by those in Army installations,
who have usually given advance manuscript copies to the panelists.


Besides  the  technical  aspects,  these  sessions  provide  a  source  for 
initiating future collaboration between scientists in Army installations
and those in university life.

At the start of this year's opening session, Dr. Walter D. Foster
was honored with a Certificate for Achievement for the valuable contributions
he made during his twelve years as Chairman of the Probability
and Statistics Subcommittee of the Army Mathematics Steering Committee.
He was specifically cited for "continuously and vigorously crusading
for application of sound statistical principles and methodology to
problems in Army research and development."

On the evening of the first day of the Conference, a banquet is
held at which the Samuel S. Wilks Memorial Award of the American
Statistical Association and the Department of the Army is presented.
At this meeting the 11th award was presented by Lester Frankel, President
of the ASA, to Dr. Herbert Solomon, Professor of Statistics, Stanford
University. The award was made to Dr. Solomon for his significant
contributions to statistical methodology and for his outstanding contributions
in the application of statistics in the service of the nation.

The  Army  Mathematics  Steering  Committee  sponsors  these  meetings  on 
behalf of the Office of the Chief of Research and Development and
Acquisition to bring new developments in statistics to Army scientists
and engineers and to expose them to thinking that could be profitable
to them in the execution of their missions. The Committee has asked
that the proceedings of the Conference be published and issued Army-wide
and to other scientific communities.

At  the  beginning  of  each  calendar  year  the  program  committee  for 
these  conferences  is  selected  and  meets  in  Washington,  DC,  to  suggest 
areas  of  interest,  to  outline  a  program,  and  to  suggest  speakers  for 
the  meeting  to  be  held  later  that  year.  I  would  like  to  express  my 
appreciation  to  Dr.  Frank  Grubbs,  Program  Chairman  for  this  year's 
committee,  and  to  Dr.  Douglas  Tang,  Chairman  of  the  Subcommittee  on 
Probability  and  Statistics,  Army  Mathematics  Steering  Committee,  for 
their  efforts  and  great  help.  My  thanks  also  go  to  other  committee 
members involved in developing this year's program: Drs. David W.
Alling, Gary A. Chase, Walter D. Foster, Bernard Harris, J. Stuart
Hunter, Clifford J. Maloney, Badrig Kurkjian, and Marvin Schneiderman.
Francis Dressel, as always, was helpful in many ways in making sure
the program was a success. Thus many hands helped in guiding this
Conference to a successful conclusion, and this is very much appreciated.


Herbert  Solomon 
Conference  Chairman 
AGENDA


THE  TWENTY-FIRST  CONFERENCE  ON  THE  DESIGN  OF  EXPERIMENTS 
IN  ARMY  RESEARCH,  DEVELOPMENT  AND  TESTING 

22-24 October 1975

The  Armed  Forces  Institute  of  Pathology 
***** Wednesday, 22 October *****

0830-0930 REGISTRATION -- Lobby of Sternberg Auditorium (WRAIR)
0930-1220 GENERAL SESSION I -- Sternberg Auditorium
CALLING  OF  CONFERENCE  TO  ORDER 

Dr. Walter D. Foster, Chairman on Local Arrangements, Armed
Forces Institute of Pathology, Washington, D. C.

WELCOMING  REMARKS 

Major General Robert Bernstein, Commander, WRAMC
CHAIRMAN  OF  SESSION  I 

Dr.  Frank  E.  Grubbs,  Program  Committee  Chairman,  Aberdeen 
Proving  Ground,  Maryland 

SUCCESS  IN  SOCIAL  AND  MEDICAL  EXPERIMENTATION 

Professor Frederick Mosteller, Department of Statistics,
Harvard  University,  Cambridge,  Massachusetts 

1050-1120  BREAK 

1120-1220 GENERAL SESSION I (CONTINUED)

SECOND  PART  OF  THE  ADDRESS  BY  PROFESSOR  MOSTELLER 
1220-1320 LUNCH -- Officers' Open Mess, WRAMC
1320-1430 CLINICAL SESSION A -- Dart Auditorium (AFIP)

CHAIRMAN

Boyd  Harshbarger,  Department  of  Statistics,  Virginia  Polytechnic 
Institute  and  State  University,  Blacksburg,  Virginia 

PANELISTS 

Robert  Bechhofer,  Department  of  Operations  Research,  Cornell 
University, Ithaca, New York

Bureau of Standards, Washington, D. C.

A.  Clifford  Cohen,  Institute  of  Statistics,  University  of 
Georgia,  Athens,  Georgia 

J. Richard Moore, U.S. Army Ballistics Research Laboratories,
Aberdeen  Proving  Ground,  Maryland 

INVESTIGATIONS  OF  INTERFACE  BETWEEN  5.56MM  BULLETS  AND  RIFLING 
CONFIGURATIONS 

Dennis Conway, Munitions Development and Engineering Directorate,
Frankford  Arsenal,  Philadelphia,  Pennsylvania 

A STEP TOWARD THE RATIONAL DESIGN OF EXPERIMENTS IN
METAL-FORMING TECHNOLOGY

Paul  Gordon,  Materials  Engineering  Division,  Pitman-Dunn 
Laboratory,  Frankford  Arsenal,  Philadelphia,  Pennsylvania 

1320-1430 TECHNICAL SESSION 1 -- Owen Conference Room (AFIP)

CHAIRMAN 

Lang  Withers,  U.S.  Army  Operational  Test  and  Evaluation  Agency, 
Fort Belvoir, Virginia

DESIGN OF EXPERIMENTS DEALING WITH MAN-MACHINE INTERFACE IN
CURRENT  COMMUNICATIONS  SYSTEMS 

R.  J.  D’Accardi,  H.  S.  Bennett,  U.  S.  Army  Electronics  Command, 
Fort  Monmouth,  New  Jersey 

J.  R.  Hennessy,  U.S.  ARMY  MERDC,  Fort  Belvoir,  Virginia 

PLANNING  FOR  THE  MEASUREMENT  OF  FLIGHT  TRAJECTORY 

J. B. Gose and J. V. Carrillo, Quality Assurance Office,
U.S. Army White Sands Missile Range, White Sands, New Mexico

1430-1500 BREAK

1500-1710 GENERAL SESSION II -- Sternberg Auditorium (WRAIR)

CHAIRMAN 

Dr. Marvin A. Schneiderman, National Cancer Institute, Bethesda,
Maryland 

NONRANDOMIZED  CLINICAL  TRIALS 

Professor Edmund A. Gehan, Department of Biomathematics,
University of Texas System Cancer Center, Houston, Texas

RANDOMIZED  CLINICAL  TRIALS 

Professor  Paul  Meier,  Department  of  Statistics,  The  University 
of  Chicago,  Chicago,  Illinois 

1830-1915 SOCIAL GATHERING -- Officers' Open Mess, WRAMC
1915-  BANQUET 

PRESENTATION OF THE SAMUEL S. WILKS MEMORIAL AWARD
Dr.  Frank  E.  Grubbs,  Master  of  Ceremonies 
*****  Thursday,  23  October  ***** 

0830-1010 CLINICAL SESSION B -- Dart Auditorium (AFIP)

CHAIRMAN 

Badrig Kurkjian, U.S. Army Materiel Command, Alexandria, Virginia
PANELISTS 

Robert  Bechhofer,  Department  of  Operations  Research,  Cornell 
University,  Ithaca,  New  York 

Seymour  Geisser,  School  of  Statistics,  University  of  Minnesota, 
Minneapolis, Minnesota

J. Richard Moore, U.S. Army Ballistics Research Laboratories,

Richard L. Moore, U.S. Army Armament Command, Rock Island, Illinois

CLINICAL SESSION B (CONTINUED)

EMPIRICAL  COMPARISON  OF  CRITERION-REFERENCED  MEASUREMENT 
MODELS 

Frederick H. Steinheiser, Jr. and Kenneth I. Epstein, U.S. Army
Research  Institute  for  the  Behavioral  and  Social  Sciences, 
Arlington,  Virginia 

PRESSURE  IMPULSE  METHODOLOGY 

Barry  H.  Rodin,  Concepts  Analysis  Laboratory,  Ballistic 
Research  Laboratory,  Aberdeen  Proving  Ground,  Maryland 

0830-1010 TECHNICAL SESSION 2 -- Owen Conference Room (AFIP)

CHAIRMAN 

Douglas B. Tang, Department of Biostatistics/Applied Mathematics,
Division  of  Biometrics  and  Medical  Information  Processing,  Walter 
Reed  Army  Institute  of  Research,  Washington,  D.  C. 

NONRANDOMIZED  FACTORIAL  DESIGNS  CHARACTERIZED  BY  TREND 
ELIMINATION  AND  A  MINIMUM  NUMBER  OF  FACTOR  LEVEL  CHANGES 

Les  Lancaster  and  Steve  Reynolds,  U.S.  Army  Operational 
Test and Evaluation Agency, Fort Belvoir, Virginia

A METHOD OF ESTIMATING ERROR VARIANCE IN A NON-REPLICATED
EXPERIMENT BY PARTITIONING AN INTERACTION TERM INTO
NON-ADDITIVITY AND ERROR

Lieutenant  L.  Douglas  Peirce,  Army  Logistics  Management 
Center,  School  of  Logistics  Science,  Systems  and  Cost  Analysis 
Department,  Fort  Lee,  Virginia 

PLANNING QUANTAL RESPONSE TESTS FOR ORDNANCE DEVICES: THE
TWO-POINT  STRATEGY  AND  ANALYSIS 

R. E. Little, The University of Michigan-Dearborn, School
of  Engineering,  Dearborn,  Michigan 


0830-1010  TECHNICAL  SESSION  3  --  Carroll  Auditorium 
CHAIRMAN 

Eugene  F.  Dutoit,  U.S.  Army  Infantry  School,  Directorate  of 
Combat Developments, Fort Benning, Georgia

APPLICATIONS  OF  THE  MONTE  CARLO  TECHNIQUE  TO  DETERMINE  STATISTICAL 
STRESS  AND  STRAIN  RESPONSE  AROUND  CUT-OUTS  IN  COMPOSITES 

Donald  M.  Neal,  Army  Materials  and  Mechanics  Research  Center, 
Watertown,  Massachusetts 

TECHNIQUE  FOR  STATISTICALLY  DETERMINING  FLIGHT  SUITABILITY  OF 
AN ARTILLERY PROJECTILE

Gertrude Weintraub and Ronald Corn, Picatinny Arsenal, Dover,
New Jersey

BAYESIAN SYSTEM RELIABILITY GROWTH ANALYSIS USING SUBSYSTEM DATA

John G. Mardo, Picatinny Arsenal, Dover, New Jersey

1010-1040 BREAK

1040-1220 CLINICAL SESSION C -- Dart Auditorium (AFIP)

CHAIRMAN 

R. J. D'Accardi, U.S. Army Electronics Command, Fort Monmouth,
New  Jersey 

PANELISTS 

A.  Clifford  Cohen,  Institute  of  Statistics,  University  of 
Georgia,  Athens,  Georgia 

Larry H. Crow, U.S. Army Materiel Systems Analysis Agency,
Aberdeen Proving Ground, Maryland

Bernard  Harris,  Mathematics  Research  Center,  University  of 
Wisconsin,  Madison,  Wisconsin 

Herbert Solomon, Department of Statistics, Stanford University,
Stanford, California

APPLICATION OF LIFE TESTING TECHNIQUES TO DETECTION DATA

Carl B. Bates, U.S. Army Concepts Analysis Agency, Bethesda,
Maryland

CHAIRMAN

Walter Reed Army Institute of Research, Washington, D. C.

ON THE ROBUSTNESS OF THE EXPONENTIAL DISTRIBUTION

George  C.  Canavos,  School  of  Business,  Virginia  Commonwealth 
University,  Richmond,  Virginia 

RANDOM  INTERVAL  RELIABILITY 

Gerald R. Andersen, Office AMC Chief Mathematician, HQ, U.S.
Army  Materiel  Command,  Alexandria,  Virginia 

CONFIDENCE  INTERVALS  FOR  A  SUM  OF  RENEWAL  PROCESSES  WITH 
APPLICATION  IN  RELIABILITY 

Ronald L. Racicot, Applied Math & Mechanics Division, Research
Directorate,  Benet  Weapons  Laboratory,  Watervliet  Arsenal, 
Watervliet,  New  York 

STRUCTURAL  VARIANCE  ESTIMATION 

Clifford J. Maloney and Lucille Carver, Bureau of Biologics,
FDA,  Rockville,  Maryland 

1220-1320 LUNCH -- Officers' Open Mess, WRAMC

CHAIRMAN

Clifford J. Maloney, Bureau of Biologics, FDA, Bethesda, Maryland

PANELISTS

Herbert Solomon, Department of Statistics, Stanford University

John Bart Wilburn, Jr., I&M Branch, U.S. Army Electronic
Proving  Ground,  Fort  Huachuca,  Arizona 

OUTLIER  DETECTION  PROCEDURES  IN  TRAJECTORY  DATA  REDUCTION 

William  S.  Agee  and  Robert  H.  Turner,  Analysis  and  Computation 
Division,  National  Range  Operations  Directorate,  U.S.  Army 
White  Sands  Missile  Range,  White  Sands,  New  Mexico 

1320-1520 TECHNICAL SESSION 5 -- Owen Conference Room (AFIP)

CHAIRMAN 

Ian McLean, Armed Forces Institute of Pathology, Washington, D. C.

Abstract. The interface between 5.56mm ball and tracer bullet
designs and various rifling configurations is examined to
determine the effects on ballistic performance and mechanical
integrity as would be experienced under general purpose
machine gun operational modes.

Two  modes  of  projectile  failure  are  examined  against 
light  machine-gun  system  design  criteria.  Based  on  these 
results,  optimum  rifling  configurations  are  identified  for 
use  in  a  machine-gun  system. 

Verification of these optimized rifling designs through
experimentation is discussed.

1. Introduction. Initial interest in the study of those
parameters affecting barrel/bullet interface was generated
at  Frankford  Arsenal  under  the  6mm  tracer  program.  At  that 
time,  the  6mm  ball  and  tracer  cartridges  were  the  prime 
ammunition candidates for the Squad Automatic Weapon (SAW),
and  consequently  great  concern  was  expressed  at  a  high 
incidence  of  tracer  projectile  failures  (break-up)  then 
being  observed  during  both  test  barrel  and  weapon  barrel 
performance  tests. 

Table 1 categorizes various tracer projectile malfunctions
from four- and six-groove, plated and unplated weapon and test
barrels. This table shows the high frequency of projectile failures
from four-groove plated weapon barrels and, to a lesser degree,
from four-groove plated test barrels.

As  a  result  of  this  high  incidence  of  projectile  failure, 
an  analytic  stress  study  was  undertaken  to  examine  certain 
modes  of  failure  which  could  explain  the  type  of  projectile 
break-up  being  exhibited. 
TABLE 1

COMPARATIVE RESULTS SHOWING 6MM TRACER MALFUNCTIONS IN FOUR- AND SIX-GROOVE BARRELS

2. Stress Evaluation. The typical 6mm tracer failure as
observed  in  recovered  projectiles  was  evidenced  by  a  radial 
flaring  of  the  projectile  base  and  longitudinal  separation 
of  the  projectile  jacket,  as  if  the  pyrotechnic  column 
exploded  after  muzzle  exit. 

The  modes  of  projectile  failure  examined  in  the  initial 
stress  study  were: 

a.  The  shear  deformation  or  out-of-roundness  occurring 
in  the  projectile  jacket. 

b. The stress field encountered by the projectile
jacket after engraving and during acceleration of the projectile.

Shortly  after  the  initiation  of  the  stress  study,  DA 
guidance  was  received  eliminating  the  6mm  concept  from 
inclusion  as  a  SAW  contender.  Developmental  efforts  were 
redirected  towards  the  consideration  of  a  5.56mm  SAW 
ammunition  contender,  which  was  easily  included  in  the  analytic 
study.  Shown  in  Table  2  are  the  pertinent  projectile 
characteristics  for  the  5.56mm  concepts  under  development. 

In selecting an ammunition design as a SAW contender, several
design  criteria  were  applied  to  the  analysis  in  order  to 
define  the  use  of  the  projectile  and  weapon  barrel  in  a  light 
machine-gun  role.  These  design  criteria  are  outlined  in 
Table  3.  In  addition  to  these  design  parameters  addressing 
projectile  integrity,  any  interior  bore  configuration  must 
satisfy  other  basic  performance  requirements  such  as  projectile 
accuracy,  barrel  life  under  machine-gun  firing  schedules, 
interior  ballistics,  terminal  effectiveness  and  high  rate 
manufacture  by  current  methods. 

The  effect  of  shear  deformation  on  the  projectile  integrity 
was  considered  by  applying  thin-ring  theory  to  the  projectile 
jacket  with  "n"  distributed  forces  being  applied  corresponding 
to  the  number  of  lands.  The  results  of  the  analysis  indicated 
that during the engraving process it is desirable that the
pressure under the land be as large as possible for any given
deflection. The reason is that engraving is caused by the
jacket material becoming plastic: the smaller the deflection
at which the material goes plastic, the less out-of-roundness
is incurred by the jacket. When this result is considered relative to the
pressures and deflections induced by four- and six-groove
barrels, the six-groove configuration is clearly superior
to the four-groove, even when a six-groove barrel with minimum
land height is compared with a four-groove barrel with maximum
land height.

The  stress  field  developed  on  the  jacket  after  engraving 
and  during  acceleration  was  addressed  by  considering  a 
pressure  gradient  acting  from  the  bottom  to  the  top  of  the 
engraved  surface.  By  relating  this  pressure  distribution 
to  the  depth  of  engraving,  minimum  values  of  engraving  depth 
were  calculated  such  that  the  probability  of  jacket  shearing 
is  reduced.  This  minimum  depth  of  engraving  was  shown  to 
be .0017 in. for the four-groove barrel and .0011 in. for the
six-groove. These minimum engraving depths were applied to
the analysis in determining optimum bore configurations.


Optimum  Bore  Dimensions  and  Projectile  Compatibility.  When 
considering  the  minimum  engraving  depths  required  together 
with  the  pertinent  design  criteria  and  projectile  dimensions, 
it  is  possible  to  compute  optimum  rifling  dimensions  such  that 
the  types  of  system  failures  considered  will  be  minimized. 

This  was  done  for  the  projectiles  being  developed  by  relating 
the  minimum  engraving  depths  required  such  that  jacket  shear 
does  not  take  place  as  a  function  of  projectile  diameter, 
bore  diameter,  barrel  temperature,  jacket  deformation  due  to 
engraving  and  land  wear.  This  relationship  is  shown  in 
equation  1-1. 


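(1-1)   $l_e = R_p - R_{bo}(1 + \alpha \Delta T) - W_b - \delta_{Ly}$

where,  $l_e$          = minimum engraving depth required
        $R_{bo}$       = bore radius or land radius
        $R_p$          = projectile radius
        $\alpha$       = coefficient of thermal expansion
        $\Delta T$     = barrel temperature gradient under hot condition
        $W_b$          = barrel wear
        $\delta_{Ly}$  = jacket displacement before yielding

(The operators joining the last two terms are not legible in this copy;
they are taken here as subtractions.)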


By solving equation 1-1 for Rbo, the land diameter suited
to  each  projectile  design  can  be  found.  The  optimum  groove 
size  was  derived  such  that  the  smallest  projectile  diameter 
used  in  the  bore  will  have  the  same  diameter  as  the  groove 
at  its  highest  temperature  as  shown  in  equation  1-2.  This 
would  correspond  to  the  barrel  temperature  reached  under 
sustained  firing  schedules. 
(1-2)   $D_G = \dfrac{D_{p\,\min}}{1 + \alpha \Delta T}$

where   $D_G$           = groove diameter
        $D_{p\,\min}$   = minimum projectile diameter
        $\alpha$        = coefficient of thermal expansion
        $\Delta T$      = barrel temperature gradient

The optimum barrel dimensions calculated using equations
1-1 and 1-2 are shown in Table 4. Note that configurations
1 and 2 are optimum based on tracer projectiles of differing
diameters, while configuration 3 considers an increased land
height for larger barrel wear over configurations 1 and 2.
Standard 5.56mm barrel dimensions are shown as reference.

A numerical exercise was performed utilizing the optimum
rifling dimensions and projectile dimensions to demonstrate
the range of in-bore interferences and clearances possible
under "best" and "worst" design conditions. Table 5 summarizes
the results of this exercise, giving a range of interference/
clearance values for both the standard 5.56mm bore configuration and
the optimized configurations. To properly compute these interference/
clearance values, the following parameters were considered:

minimum and maximum bullet diameters (ball and tracer)
minimum and maximum land and groove diameters
.0005 in. diametrical land wear
diametrical bore expansion at 1250°F

Table 6 lists the equations used to compute the ranges
of interference/clearance and minimum land height values.

[The tabular pages that follow in the original are not recoverable from
this copy. The surviving notes to the interference/clearance table
(Table 5) read:]

MINIMUM ENGRAVING REQUIRED FOR SIX-GROOVE CONFIGURATION IS .0011 IN.
MAXIMUM INTERFERENCE CONSIDERS NO WEAR AND NO THERMAL EXPANSION.
BORE DIMENSIONS INCREASE BY .0019 IN. ON DIAMETER DUE TO THERMAL EXPANSION.
LAND DIMENSIONS INCREASE BY .0005 IN. ON DIAMETER DUE TO WEAR.
NEGATIVE INTERFERENCES DENOTE CLEARANCES.

In comparing the standard barrel designs with the optimized
cases, it is important to view these results in a strictly
static sense, in that projectile deformation into the
barrel grooves was not considered. However, despite the
rather static condition under which these numbers were
generated, a major difference among designs can be noted.
In all cases, the optimized designs exhibit a greater
projectile/barrel interference, or lesser projectile/barrel
clearance, than the standard barrel dimensions. This important
difference is the direct result of attempting to accommodate
differing ball and tracer projectile diameters while ensuring
satisfactory system performance over a temperature range from
ambient to 1250°F. These design parameters are further
aggravated by considering land wear.
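
The thermal and wear allowances in equations 1-1 and 1-2 can be illustrated with a short numerical sketch. It assumes that the wear and pre-yield displacement terms of equation 1-1 are subtracted, and every function name and input value below is hypothetical (chosen only to exercise the formulas) apart from the .0011 in. six-groove minimum engraving depth quoted in the text.

```python
def groove_diameter(min_projectile_dia, alpha, delta_t):
    """Equation 1-2: cold groove diameter that grows to the minimum
    projectile diameter once the barrel has heated by delta_t."""
    return min_projectile_dia / (1.0 + alpha * delta_t)


def engraving_depth(proj_radius, land_radius, alpha, delta_t, wear, pre_yield):
    """Equation 1-1, assuming wear and pre-yield displacement are subtracted."""
    return proj_radius - land_radius * (1.0 + alpha * delta_t) - wear - pre_yield


def land_radius_for(proj_radius, min_depth, alpha, delta_t, wear, pre_yield):
    """Equation 1-1 solved for the cold land radius Rbo."""
    return (proj_radius - min_depth - wear - pre_yield) / (1.0 + alpha * delta_t)


# Hypothetical inputs (inches, deg F). Only the .0011 in. depth is from the text.
ALPHA = 7.0e-6      # assumed expansion coefficient of barrel steel, per deg F
DELTA_T = 1180.0    # assumed rise from roughly 70 F ambient to 1250 F

d_groove = groove_diameter(0.2240, ALPHA, DELTA_T)
r_land = land_radius_for(0.1120, 0.0011, ALPHA, DELTA_T, 0.00025, 0.0002)

print(f"cold groove diameter: {d_groove:.4f} in.")
print(f"cold land radius:     {r_land:.4f} in.")
```

Solving for the cold dimension and then re-applying the expansion factor is a convenient consistency check: the hot groove diameter should return the minimum projectile diameter, and the land radius should return the required engraving depth.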

Comparing the interferences and clearances shown in
Table 5 with the minimum required land engagement of .0011
in. for six-groove configurations shows possible problem
areas. Even though the minimum land heights under
worst conditions exceed this .0011 in. requirement, it is
not necessarily true that proper engraving will occur. This
situation occurs in the 5.56mm standard six-groove design,
for both ball and tracer comparisons. Although the minimum
land height at 1250°F is adequate for the required .0011 in.
engraving, this engraving cannot occur if the projectile/
land interference runs as low as .0005 in., as it does for
the tracer. This minimal interference could lead to a
serious skidding problem.
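
The comparison made in this paragraph can be sketched as a simple check. The subtraction used for interference is an assumption (the Table 6 equations are not reproduced here), the bullet and land diameters are hypothetical values picked so that the worst case lands near .0005 in., and the interference is compared directly with the required engraving depth, as the text does. The wear, thermal growth, and .0011 in. figures are the ones quoted in the paper.

```python
REQUIRED_ENGRAVING = 0.0011   # in., six-groove minimum from the stress study
THERMAL_GROWTH = 0.0019       # in. on diameter at 1250 F, per the table notes
LAND_WEAR = 0.0005            # in. on diameter, per the table notes


def interference(bullet_dia, land_dia, wear=0.0, thermal_growth=0.0):
    """Diametral interference in inches; a negative result is a clearance."""
    return bullet_dia - (land_dia + wear + thermal_growth)


def engraving_assured(bullet_dia, land_dia, wear=0.0, thermal_growth=0.0):
    """True when the interference is at least the required engraving depth."""
    value = interference(bullet_dia, land_dia, wear, thermal_growth)
    return value >= REQUIRED_ENGRAVING


# Hypothetical diameters (in.), not taken from the paper's tables.
BULLET_MIN = 0.2219
LAND_MAX = 0.2190

best = interference(BULLET_MIN, LAND_MAX)                              # new, cold barrel
worst = interference(BULLET_MIN, LAND_MAX, LAND_WEAR, THERMAL_GROWTH)  # worn, hot barrel

print(f"best case:  {best:.4f} in., engraving assured: "
      f"{engraving_assured(BULLET_MIN, LAND_MAX)}")
print(f"worst case: {worst:.4f} in., engraving assured: "
      f"{engraving_assured(BULLET_MIN, LAND_MAX, LAND_WEAR, THERMAL_GROWTH)}")
```

With these illustrative inputs the land height itself is never the limiting factor; the worn, hot barrel simply leaves too little interference for the land to bite, which is the skidding condition described above.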

Experimental Evaluation. The accuracy of the analysis,
as well as the suitability of any barrel design for field use,
can only be verified through extensive testing. Toward
this end, a quantity of barrels of various configurations
has been procured for evaluation of system performance
levels. Table 7 is a matrix showing the quantity and types
of barrels which will be the core of an exhaustive barrel
performance program. These barrels will be tested along
with approximately 45,000 rounds of 5.56mm ball and tracer
ammunition against current SAW performance requirements
so that sufficient statistical significance is obtained
to point to a single rifling configuration.

Plans  for  testing  currently  envision  adhering  to  current 
acceptance  standards  for  5.56mm  and  7.62mm  ammunition  and 
will  mirror  sample  sizes  of  barrels  and  ammunition  contained 
therein. 
TABLE 7

5.56MM (SAW) AMMUNITION/WEAPON INTERFACE
BARREL MATRIX

                                      QUANTITY BY BARREL TYPE
BORE CONFIGURATION                    ACCURACY  PRESSURE  WEAPON*    WEAPON
                                                          (CHROMED)  (UNCHROMED)

STANDARD 5.56MM RIFLING                  2         2         3          2

6-GROOVE BORE, 1 IN 12 TWIST,            2         2         3          2
UNDERSIZED TRACER (CONFIG. 1)

6-GROOVE BORE, 1 IN 11 TWIST,            2         2         3          2
UNDERSIZED TRACER (CONFIG. 1)

6-GROOVE BORE, 1 IN 12 TWIST,            2         2         3          2
BALL SIZE TRACER (CONFIG. 2)

6-GROOVE BORE, 1 IN 11 TWIST,            2         2         3          2
BALL SIZE TRACER (CONFIG. 2)

6-GROOVE BORE, 1 IN 11 TWIST,            2         2         3          2
INCREASED LAND HEIGHT
FOR ECCENTRICITY (CONFIG. 3)
DESIGN OF EXPERIMENTS DEALING WITH MAN-MACHINE INTERFACE
IN CURRENT COMMUNICATIONS SYSTEMS

R. J. D'Accardi and H. S. Bennett, U.S. Army Electronics Command,
Fort Monmouth, New Jersey

J. R. Hennessy, U.S. Army MERDC, Fort Belvoir, Virginia

ABSTRACT. Recently, the US Army Electronics Command has supported experiments
dealing with man-machine interface problems occurring in Tactical Communications
Systems. The aim was to characterize communications system operators'
performance under various environmental conditions related to tactical operations.
The study was directed towards system equipment such as the standard teletype
and optical-read-only terminal equipment. Using these devices, the significance
of acoustic noise and ambient light on operator performance was studied
under sixteen combinations of environmental conditions.

The object of this presentation is threefold. First, we discuss the methods
of evaluating message transfer over man-machine interfaces, including audio
and visual. Second, we discuss the design of the experiment and modeling to
determine the operator characteristics under different environmental conditions.
Third, we present statistical estimates of: (a) the effects of the
controlled variables (ambient light and acoustic noise) upon the transcription
accuracy of several operators, (b) measures of experimental error to define
a range of values, for a prescribed level of confidence, within which the
true value of the estimates may be found, and (c) the most significant
combinations of environmental effects on operator performance. Several
multivariate regression models which characterize operator performance are
presented and the criteria for choosing the best model are discussed.

INTRODUCTION. Information gained in evaluating and solving man-machine
interface problems that occur in complex communications systems is extremely
important to systems engineers committed to the design and fabrication of
future generations of equipment. Sophisticated systems of Command and Control,
computer-aided man-in-the-loop systems (e.g., manned spacecraft), human
response to audio and visual displays, management functions, pattern
recognition, man-computer languages, cutaneous communication and many other
facets are of concern where an operator must perform a control task or
decision task. At present there is a large volume of on-going work oriented
towards man-machine interfaces which span the projected needs of the Armed
Forces. For example, work in progress by the Naval Electronics Systems
Command, 6570th Aerospace Medical Research Laboratory, DA ARI for the
Behavioral Sciences, ECOM and HEL (to name a few) generally deals with
evaluation of complex system interfaces, assessment of operator performance
capabilities for a wide variety of tasks, analysis of manual functions into
tasks, analysis of human control functions, and the physical and psychological
characteristics which affect the assessment of operator performance
capabilities. Much of the on-going work concerns the psychological and
physiological aspects of command and control in tactical operations, weapons
systems, vehicles management, logistics, and communications. Some of the more
specific areas of investigation are:

1.  Work/rest  schedules  and  effects  on  man-machine  performance. 

2.  Utilization  of  Bio-electric  phenomena  to  automatically  control 
complex  systems. 

3.  Measures of operator performance under different mixes of equipment,
personnel and procedures.

4.  Physiological  aspects  (fatigue,  alertness,  metabolism,  endocrine 
gland  functions,  and  central  nervous  system)  of  operator  efficiency 
and  man-machine  interface. 

5.  System  simulation  to  study  the  impact  of  operator  performance  on 
complex  systems  as  a  function  of  environmental  threat,  mission,  and 
work  load  stress. 

6.  Army  Tactical  Flight  operations  under  adverse  visibility  conditions. 

7.  Influence  of  USAF  operational  environments  on  air  crew  utilization. 

Examination of ongoing research in these areas indicates that there is
no clear-cut procedure to evaluate the human subsystem in a sophisticated
communications system or the effects of environmental stress on operator
performance. Army communications requirements in a tactical situation often
require 24-hour operations, and personnel are required to work either on
standard or unpatterned and frequently extended duty schedules, in a variety
of environments, each characterized by multiple stresses occurring in a
random manner. For example, the accuracy in reading an optical display is
dependent on many variables such as number of lines, characters, ambient
lighting, environmental noise, speed of display, correction time, back-log,
operator physiology (e.g., mood, fatigue, attention, and training), display
brightness and size, and effective signal-to-noise ratio (legibility), to
name a few. Since future Army requirements include optical display terminals,
it is essential to provide insight into those variables that affect accuracy
through the man-machine interface and the effects caused by physiological
factors. To answer the Army's need for measures of man-machine interfaces
which occur in communications systems and to enhance the design of future
families of equipment, this report addresses teletype operator performance
as the environmental factors of ambient light and acoustic noise are varied.
The design of the experiment performed at Ft. Monmouth, New Jersey during
April and May 1975 and its results are discussed. Experimental results and
several models are presented which show the significance of these variables
on experienced teletype operators.
DESIGN OF THE EXPERIMENT. The significance of acoustic noise and ambient
light on operator performance was investigated using a visual display
transmission device (see figure 1). This is a visual terminal designed to
interface with computers or store-and-forward devices. Primarily, it is
developmental equipment intended to present messages visually on a CRT
display where an operator can see and correct his message prior to
transmission. The advantages of this equipment over the standard military
teletypewriter were not addressed in this experiment.

The experiment consisted of testing the transcription accuracy of six
experienced communications-center operators under 16 combinations of
environmental conditions. Ambient light was varied at four levels, ranging
from 24 ft-candles to 3 ft-candles, and acoustic noise was concurrently
varied at four levels ranging from 55 dBa to 95 dBa. Sound pressure level
(SPL) measured in dBa is in reference to 0.0002 dynes/cm². This is
considered the threshold of hearing and is roughly equivalent to a leaf
"falling" on a quiet day. The 55 dBa level was considered the quiet
condition where only the inherent noise from the terminal equipment, sound
room noise, and thermal noise were recorded. The 95 dBa level represented
an extremely annoying and distracting "pink" noise. The noise power per
unit frequency for this type of noise is inversely proportional to frequency
over a specified range and slopes down at 3 dB per octave from 20 Hz to 20 kHz.
These characteristics are more common to conference type noise where the
higher and lower frequency components characterize motor and equipment
noises. Pink noise was also used because it has relatively constant energy
per octave bandwidth. The 24 ft-candle light level compared favorably to the
Army Corps of Engineers standard for office lighting. The other chosen levels
of 12, 6 and 3 ft-candles, respectively, represented successively deteriorating
ambient light conditions. Throughout the testing, the brightness of the visual
display was constant.
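
As an illustration of the decibel scale quoted above, the following minimal sketch converts the four noise levels to sound pressure using the stated reference of 0.0002 dynes/cm². It assumes the usual definition SPL = 20 log10(p / p_ref) and ignores the frequency weighting implied by "dBa"; the function and variable names are hypothetical and are not part of the experiment.

```python
import math

# Reference pressure stated in the text (dynes/cm^2).
P_REF = 0.0002


def spl_to_pressure(level_db):
    """RMS sound pressure (dynes/cm^2) for a level in dB re P_REF.

    Assumes SPL = 20 * log10(p / P_REF); frequency weighting is ignored.
    """
    return P_REF * 10.0 ** (level_db / 20.0)


def pressure_to_spl(pressure):
    """Inverse of spl_to_pressure."""
    return 20.0 * math.log10(pressure / P_REF)


# The four acoustic noise levels used in the experiment.
NOISE_LEVELS_DB = [55, 70, 80, 95]
pressures = {level: spl_to_pressure(level) for level in NOISE_LEVELS_DB}

# A 40 dB span corresponds to a factor of 10**(40/20) in pressure.
ratio = pressures[95] / pressures[55]

for level in NOISE_LEVELS_DB:
    print(f"{level} dB -> {pressures[level]:.4f} dynes/cm^2")
```

Under this definition the loudest condition carries one hundred times the sound pressure of the quiet condition.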

For  each  test  the  operator  was  required  to  type  his  name,  treatment 
combination,  and  date  as  part  of  the  message,  see  figure  2.  The  messages  for 
the  experiment  consisted  of  forty  random-letter  word  groups  of  five 
characters  each.  They  were  derived  through  a  random  number  generator  and  an 
alphanumeric conversion. No message was repeated, either for any operator
or on either terminal equipment. The random letter
format  was  used  so  that  the  operator  could  not  identify  or  recognize  message 
words  and  therefore  would  have  to  concentrate  on  the  given  formats  to  avoid 
making  transcription  errors.  The  aim  of  the  experiment  was  to  vary  the 
environmental  variables  and  to  observe  the  accuracy  and  speed  of  transcribing 
the  random  letter  formats  as  a  function  of  these  variables.  The  response 
variable,  accuracy,  was  the  measure  of  transcription  errors  that  each  operator 
committed  per  message  format.  The  errors  considered  were  the  following: 

1.  transposition

2.  missing letter

3.  extra letter

4.  incorrect space

5.  extra line feed

6.  missing word groups

7.  wrong letter

8.  line out of sequence (skipped line inserted after detection)

9.  word group out of sequence

Figure 2 - Message Format
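
The message format described above (forty random-letter groups of five characters, with no message repeated) can be sketched as follows. This is only an illustration: the original generator and its alphanumeric conversion are not described in detail, so the use of uppercase letters only, the seed, and the function names are assumptions.

```python
import random
import string

GROUPS_PER_MESSAGE = 40   # forty word groups per message (from the text)
CHARS_PER_GROUP = 5       # five characters per group (from the text)


def make_message(rng):
    """One message: a list of 40 random five-letter groups."""
    return [
        "".join(rng.choice(string.ascii_uppercase) for _ in range(CHARS_PER_GROUP))
        for _ in range(GROUPS_PER_MESSAGE)
    ]


def make_unique_messages(count, seed=0):
    """Generate `count` messages, rejecting any exact duplicate."""
    rng = random.Random(seed)
    seen = set()
    messages = []
    while len(messages) < count:
        candidate = tuple(make_message(rng))
        if candidate not in seen:
            seen.add(candidate)
            messages.append(list(candidate))
    return messages


# Six operators x 16 treatment runs x 2 terminals = 192 distinct messages.
messages = make_unique_messages(6 * 16 * 2)
print(" ".join(messages[0][:8]))
```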

The  results  were  compared  to  an  acceptable  operator  norm,  i.e.,  typing  a 
message  format  on  a  standard  teletype  terminal  (see  figure  3)  under 
the  same  conditions.  Each  operator  was  tested  in  four  sessions,  each  session 
programmed  for  eight  random  environmental  combinations,  four  for  each 
terminal  equipment,  where  tests  were  alternated  between  the  optical  display 
and  the  standard  teletypewriter.  This  was  done  to  reduce  the  effects  of  learning. 
A thirty-minute familiarization period was given to each operator prior to
the tests, and a standard instruction sheet was distributed during this period
to ensure uniform orientation with the equipment and with the purpose and
procedure of the experiment.

The effect of any environmental combination is considered to be the sum
of three effects, namely, those of sound, light, and the interaction of
light and sound. To adequately analyze these effects, a two-factor factorial
experiment was formulated with six replications. The four levels of acoustic
noise are combined with the four levels of ambient light, giving 4 x 4 or
sixteen treatment combinations. For a two-factor factorial experiment with n
observations per cell, run as a completely randomized design [1], [2], a
general model is:

$$Y_{ijk} = \mu + A_i + B_j + AB_{ij} + \varepsilon_{k(ij)}$$

where Y is the response variable, i.e., the number of transcribed errors,
A and B are the main effects of light and sound, AB is their interaction, and
$\varepsilon$ is the experimental error (i.e., the extent to which the observed
data and the general model disagree). The respective levels are i = 1, 2, 3, 4
and j = 1, 2, 3, 4, with k = 1, 2, ..., 6 observations per cell. The interaction
term adjusts for the failure of either one of the main effects to remain
constant for each level of the other. The test runs were randomized as shown
in table I. This was done to minimize the effects of training.
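
The sums of squares implied by this model can be sketched in a few lines. The example below is an illustration on synthetic data, not the experimental data; the function name `two_way_anova` and the random error counts are assumptions made for the sketch. It partitions the total sum of squares of a balanced 4 x 4 layout with six observations per cell into the A, B, AB and error components.

```python
import random


def two_way_anova(data):
    """Balanced two-factor ANOVA.

    data[i][j] is the list of n replicate observations for level i of
    factor A and level j of factor B.
    """
    a, b, n = len(data), len(data[0]), len(data[0][0])
    grand = sum(y for row in data for cell in row for y in cell) / (a * b * n)
    a_means = [sum(y for cell in row for y in cell) / (b * n) for row in data]
    b_means = [sum(y for row in data for y in row[j]) / (a * n) for j in range(b)]
    cell_means = [[sum(cell) / n for cell in row] for row in data]

    ss = {
        "A": b * n * sum((m - grand) ** 2 for m in a_means),
        "B": a * n * sum((m - grand) ** 2 for m in b_means),
        "AB": n * sum(
            (cell_means[i][j] - a_means[i] - b_means[j] + grand) ** 2
            for i in range(a) for j in range(b)
        ),
        "E": sum(
            (y - cell_means[i][j]) ** 2
            for i in range(a) for j in range(b) for y in data[i][j]
        ),
    }
    ss_total = sum((y - grand) ** 2 for row in data for cell in row for y in cell)
    df = {"A": a - 1, "B": b - 1, "AB": (a - 1) * (b - 1), "E": a * b * (n - 1)}
    ms = {k: ss[k] / df[k] for k in ss}
    f = {k: ms[k] / ms["E"] for k in ("A", "B", "AB")}
    return ss, df, ms, f, ss_total


# Synthetic error counts: 4 light levels x 4 noise levels x 6 operators.
rng = random.Random(1)
data = [[[rng.randint(0, 12) for _ in range(6)] for _ in range(4)] for _ in range(4)]
ss, df, ms, f, ss_total = two_way_anova(data)
print(df, {k: round(v, 2) for k, v in f.items()})
```

With four levels of each factor and six observations per cell, the degrees of freedom come out as 3, 3, 9 and 80.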
TABLE I

TREATMENT SCHEDULE PER OPERATOR

                      Environmental Treatment Combinations
Session   Run     Optical Display Terminal     Teletype Terminal

I           1              1,4                        3,1
            2              4,3                        4,4
            3              3,2                        2,2
            4              2,1                        1,3

II          5              3,1                        4,1
            6              4,4                        1,2
            7              2,2                        3,4
            8              1,3                        2,3

III         9              4,1                        2,4
           10              1,2                        3,3
           11              3,4                        1,1
           12              2,3                        4,2

IV         13              2,4                        1,4
           14              3,3                        4,3
           15              1,1                        3,2
           16              4,2                        2,1

Treatment = (Ambient Light Level, Acoustic Noise Level)

Ambient Light                     Acoustic Noise
Level    Value                    Level    Value

1        24 ft-candles            1        55 dBa
2        12 ft-candles            2        70 dBa
3         6 ft-candles            3        80 dBa
4         3 ft-candles            4        95 dBa
ANALYSIS: The following ANOVA tables and statistical estimates were
formulated to analyze the transcribed errors for the standard teletype terminal
and for the optical display terminal (tables II, III, IV and V):

TABLE II

ANOVA FOR STANDARD TELETYPE TERMINAL

                                 Sum of     Degrees of    Mean       "F"
Source                           Squares    Freedom       Square     ratio

Ambient Light, $A_i$               55.94        3          18.65      0.33
Acoustic Noise, $B_j$              99.70        3          33.23      0.59
Interaction, $AB_{ij}$            109.93        9          12.21      0.22
Error, $\varepsilon_{k(ij)}$     4494.67       80          56.18

TOTAL                            4760.24

TABLE III

ANOVA FOR THE OPTICAL DISPLAY TERMINAL

                                 Sum of     Degrees of    Mean       "F"
Source                           Squares    Freedom       Square     ratio

Ambient Light                      65.28        3          21.76      0.32
Acoustic Noise                    276.03        3          92.01      1.35
Interaction                        55.18        9           6.13      0.10
Error                            5437.50       80          67.97
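
The mean squares and "F" ratios in tables II and III follow from the tabulated sums of squares and degrees of freedom. The short sketch below recomputes them, assuming each "F" ratio is the source mean square divided by the error mean square; the dictionary layout and function names are hypothetical.

```python
# (sum of squares, degrees of freedom) as tabulated in tables II and III.
TABLE_II = {
    "Ambient Light": (55.94, 3),
    "Acoustic Noise": (99.70, 3),
    "Interaction": (109.93, 9),
    "Error": (4494.67, 80),
}
TABLE_III = {
    "Ambient Light": (65.28, 3),
    "Acoustic Noise": (276.03, 3),
    "Interaction": (55.18, 9),
    "Error": (5437.50, 80),
}


def mean_squares(table):
    """Mean square = sum of squares / degrees of freedom."""
    return {source: ss / df for source, (ss, df) in table.items()}


def f_ratios(table):
    """F ratio of each effect against the error mean square."""
    ms = mean_squares(table)
    return {s: ms[s] / ms["Error"] for s in table if s != "Error"}


f_teletype = f_ratios(TABLE_II)
f_optical = f_ratios(TABLE_III)
for name, ratios in (("teletype", f_teletype), ("optical display", f_optical)):
    print(name, {s: round(v, 2) for s, v in ratios.items()})
```

For comparison, the tabulated 5% critical value of F for 3 and 80 degrees of freedom is roughly 2.7, so ratios of this size fall well short of significance.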
TABLE IV

STATISTICAL ESTIMATES OF TRANSCRIBED ERRORS
FOR THE TELETYPE TERMINAL

Ambient                              Acoustic Noise Level            For All
Light Level     Statistic        55 dBa  70 dBa  80 dBa  95 dBa   Sound Levels

24 ft-candles   $\bar{Y}$          3.0     5.8     5.8     6.2       5.7
                $S_Y$              1.87    3.96    3.7     6.42      4.23
                $S_{\bar{Y}}$      0.84    1.77    1.66    2.87      0.95

12 ft-candles   $\bar{Y}$          2.2     6.8     6.8     9.8       6.4
                $S_Y$              2.17    2.59    5.54    8.47      5.63
                $S_{\bar{Y}}$      0.97    1.16    2.48    3.79      1.26

6 ft-candles    $\bar{Y}$          5.0     3.8     5.0     7.2       5.25
                $S_Y$              3.94    2.59    6.2     4.6       4.34
                $S_{\bar{Y}}$      1.76    1.16    2.77    2.06      0.97

3 ft-candles    $\bar{Y}$          4.4     4.0     3.8     4.2       4.10
                $S_Y$              3.36    4.95    3.03    1.79      3.19
                $S_{\bar{Y}}$      1.50    2.21    1.36    0.80      0.71

Overall         $\bar{Y}$          4.15    5.10    5.35    6.85      5.36
For All Light   $S_Y$              3.30    3.60    4.55    5.76      4.43
Levels          $S_{\bar{Y}}$      0.74    0.80    1.02    1.29      0.50
TABLE V

STATISTICAL ESTIMATES OF TRANSCRIBED ERRORS FOR THE
VISUAL DISPLAY TERMINAL

Ambient                              Acoustic Noise Level            For All
Light Level     Statistic        55 dBa  70 dBa  80 dBa  95 dBa   Sound Levels

24 ft-candles   $\bar{Y}$          3.4     5.80    6.20    9.2       6.15
                $S_Y$              2.7     4.76    5.17    4.82      4.61
                $S_{\bar{Y}}$      1.21    2.13    2.31    2.15      1.03

12 ft-candles   $\bar{Y}$          6.8     5.0     7.0     8.60      6.9
                $S_Y$              3.77    2.77    2.45    6.35      3.99
                $S_{\bar{Y}}$      1.69    1.24    1.10    2.84      0.89

6 ft-candles    $\bar{Y}$          5.0     5.2     6.2     5.6       5.46
                $S_Y$              3.16    2.39    3.96    4.16      3.28
                $S_{\bar{Y}}$      1.41    1.07    1.77    1.86      0.73

3 ft-candles    $\bar{Y}$          6.0     5.2     5.4     8.2       6.2
                $S_Y$              3.67    3.42    5.5     4.71      4.23
                $S_{\bar{Y}}$      1.64    1.53    2.46    2.11      0.94

Overall         $\bar{Y}$          5.3     5.35    6.20    7.90      6.19
For All Light   $S_Y$              3.34    3.18    4.11    4.87      4.00
Levels          $S_{\bar{Y}}$      0.75    0.71    0.92    1.09      0.45

Although one might expect that acoustic noise and ambient light would
strongly affect the production of transcription errors, no conclusive
statistical significance as to environmental effects can be adjudged
from the data. Examination of the mean squares, however, shows that acoustic
noise has a stronger effect on error production than either the ambient light
or the interaction of the two (see tables II and III). Tables IV and V show
that, averaged over all light levels, the mean transcription error production
increased by about 60% as the noise level rose from 55 dBa to 95 dBa. Averaged
over all sound levels, the transcription error did not vary significantly
with ambient light.
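
The size of the noise effect can be checked against the "Overall" rows of tables IV and V. The sketch below is a simple arithmetic check using the tabulated means; the dictionary and function names are hypothetical.

```python
# Overall mean transcription errors by noise level (dBa), averaged over
# all light levels, as tabulated in tables IV and V.
OVERALL_MEANS = {
    "teletype": {55: 4.15, 70: 5.10, 80: 5.35, 95: 6.85},
    "visual display": {55: 5.3, 70: 5.35, 80: 6.20, 95: 7.90},
}


def percent_increase(means, low=55, high=95):
    """Percentage change in mean errors between two noise levels."""
    return 100.0 * (means[high] - means[low]) / means[low]


increase = {name: percent_increase(m) for name, m in OVERALL_MEANS.items()}
for name, value in increase.items():
    print(f"{name}: {value:.0f}% increase from 55 dBa to 95 dBa")
```

On these figures the increase is somewhat above 60% for the teletype terminal and somewhat below it for the visual display terminal.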

The operators chosen were all of the same minimum proficiency, each
able to transcribe messages at 60 w.p.m., with the exception of one
trainee. Thus, examining the variation of transcription errors for the
visual display terminal at 70 dBa (see table V) for light levels below
24 ft-candles, the mean, $\bar{Y}$, and standard deviation, $S_Y$, decrease
from the 55 dBa values, then increase as noise is increased to 95 dBa.
Interviews with the subjects seem to indicate that 70 dBa is the approximate
level of noise to which they are accustomed, and therefore they were
less distracted by environmental changes in ambient light at this sound
level. The findings indicate that for the visual display terminal under
quiet conditions (i.e., at 55 dBa, the noise below standard comcenter
operational levels) at lower levels of ambient light, more errors were
made than at the normal operating (70 dBa) level. The effect of noise at the
higher levels (80 and 95 dBa) indicates the variability and adaptability
of the operators to acoustic and photic noise. It was also noted (as was
expected with the visual display terminal) that changing light levels had
the least effect on operator performance.


Six multiple linear and non-linear regression models were fitted to
the data, by the least squares method, to characterize operator performance.
The models were of the form:

(1)  $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \varepsilon_{12}$

(2)  $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_1 X_2 + \varepsilon_{12}$

(3)  $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_1^2 + \beta_4 X_2^2 + \beta_5 X_1 X_2 + \varepsilon_{12}$

(4)  $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_1^2 + \beta_4 X_2^2 + \beta_5 X_1^3 + \beta_6 X_2^3 + \beta_7 X_1 X_2 + \beta_8 X_1^2 X_2 + \beta_9 X_1 X_2^2 + \varepsilon_{12}$

(5)  $Y = \sum \beta_{jk} X_1^j X_2^k + \varepsilon, \quad 0 \le j + k \le 3$

(6)  $Y = \beta_0 + \beta_1 \ln X_1 + \beta_2 X_2 + \beta_3 \ln^2 X_1 + \beta_4 X_2^2 + \beta_5 X_2 \ln X_1 + \varepsilon_{12}$

where Y is the observed operator response, and $X_1$ and $X_2$ are independent
variables corresponding to ambient light and acoustic noise respectively.
The estimated values of the coefficients, standard errors of the estimates,
and coefficients of determination are summarized in the following table:
Least Squares Estimates Using Coded and Uncoded Data for the
Optical Display Terminal

                                          Model
Estimate             1        2        3        4      5 (Uncoded)  6 (Uncoded)

$\hat{\beta}_0$    7.078    7.078    6.785    6.684      21.049       13.715
$\hat{\beta}_1$    0.100    0.100    0.190    1.752      -3.495       -8.045
$\hat{\beta}_2$    0.680    0.680    0.680    0.655      -0.362
variance, $S^2_{(y-\hat{y})}$, and maximum coefficient of determination, $R^2_{yy}$.

This provides the model:

$$\hat{Y} = 6.785 + 1.752X_1 + 0.655X_2 + 0.449X_1^2 + 0.110X_2^2 + 0.225X_1^3 - 0.543X_2^3 + 0.232X_1X_2 - 0.076X_1^2X_2 - 0.105X_1X_2^2$$

Testing for fit, the sum squared error due to regression and the respective
degrees of freedom for the variation of $Y_i$ from the curve are 3.378 and
{9,6} respectively. If the model is correct, the residual mean square has
the expected value of $\sigma^2$. Using $S^2_{(y-\hat{y})} = \hat{\sigma}^2 = 0.5187 = MS_E$,
the "F" ratio is:

$$F = \frac{MS_R}{MS_E} = \frac{3.378}{0.518} = 3.907$$

and is not significant since 3.907 < 5.520. Thus, on the basis of minimum
$S^2_{(y-\hat{y})}$, maximum $R^2_{yy}$ and this test, we have no reason to doubt
the adequacy of this particular model. This technique is presented to show the
feasibility of using multiple least squares regression for this type of
man-machine interface problem. A more sophisticated approach is planned at a
later time when more data are obtained.
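
The least squares fitting of models such as (1) and (2) can be sketched with the normal equations. The example below is a minimal illustration under stated assumptions: the coded levels (-3, -1, 1, 3) and the synthetic, noise-free responses are hypothetical and are not the coding or data used in the experiment, and all function names are introduced for the sketch only.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit(rows, y):
    """Least squares coefficients from the normal equations (X'X) b = X'y."""
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve(xtx, xty)


def r_squared(rows, y, beta):
    """Coefficient of determination of the fitted model."""
    y_bar = sum(y) / len(y)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def model_1(x1, x2):
    return [1.0, x1, x2]            # terms of model (1)


def model_2(x1, x2):
    return [1.0, x1, x2, x1 * x2]   # terms of model (2)


# Hypothetical coded levels for light (x1) and noise (x2) on a 4 x 4 grid.
levels = (-3.0, -1.0, 1.0, 3.0)
grid = [(x1, x2) for x1 in levels for x2 in levels]
# Synthetic responses from a known surface, for illustration only.
y = [6.0 + 0.1 * x1 + 0.7 * x2 + 0.2 * x1 * x2 for x1, x2 in grid]

rows_1 = [model_1(x1, x2) for x1, x2 in grid]
rows_2 = [model_2(x1, x2) for x1, x2 in grid]
beta_1, beta_2 = fit(rows_1, y), fit(rows_2, y)
r2_1, r2_2 = r_squared(rows_1, y, beta_1), r_squared(rows_2, y, beta_2)
print([round(b, 3) for b in beta_2], round(r2_1, 3), round(r2_2, 3))
```

Comparing the coefficient of determination across candidate models, as done here for two of them, mirrors one of the selection criteria used above.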

Conclusions:  Several  adverse  aspects  of  the  terminal  equipment  were 
discovered  which  may  affect  error  production.  The  angle  of  the  keyboard 
(see  figures  4  and  5)  of  the  visual  display  terminal  was  apparently  not 
conducive  to  optimum  performance.  The  teletypewriter  keyboard  was 
unanimously  considered  more  comfortable.  Also,  the  detent  pressure  of 
the  individual  keys  and  the  absence  of  feedback  "thump"  seemed  to  increase 
the  probability  of  transcription  error  with  the  visual  display  terminal. 

While the results do not show statistical significance of the environmental
effects, the trends in the statistics (particularly the MSE and overall means,
see tables II, III, IV and V) indicate the possibility that with a larger
population of more homogeneous (as to expertise) subjects, statistical
significance will emerge. That is, the variations in human performance will
be greater under abnormal environmental conditions. If such abnormal
conditions are to be expected under battlefield conditions, then significant
training information could be extracted from such a follow-on experiment.

Figure 4 - Visual Display Terminal,
Keyboard Sideview

Another measure that could attain statistical significance is the mean
transcription error production for the group. Such statistics would
indicate the outer bounds of expectation under battlefield conditions.
PLANNING FOR THE MEASUREMENT OF FLIGHT TRAJECTORY


J. B. GOSE
Quality Assurance Office
US Army White Sands Missile Range
New Mexico


ABSTRACT. This paper describes a procedure used at White Sands
Missile Range, New Mexico for selecting instruments to measure a test
object's location and body angles. Criteria for selection include
number and location of instruments, types and quality of measurements,
probability of operation, and data reduction procedures. Optimizations
are made in terms of cost-to-support, probability of success, expected
error in data and instrumentation system used. Constraints include
expected trajectory and object dimensions, optical image size and aspect
angle, tracking rate, atmospheric distortion, and for some applications,
locations of existing facilities.

The procedure employs both theoretically and pragmatically derived
models and utilizes observed error distribution and reliability data.
It has been automated for computation on a UNIVAC 1108 computer.

1.  INTRODUCTION. The purpose of this report is to outline the
mathematical statistical scheme used for the Resource Conservation
Planning (RCP) Model. The RCP is used as a tool for evaluating and
formulating test support plans.¹ The model developed is formulated
from the multi-station solution now in use at WSMR, better known as the
Davis Solution.² This is a least-squares solution which is identical
to the maximum likelihood estimates of missile position in the particular
case in which the instrumentation measurements are normally distributed.
In 1965, 1LT Charles A. Hall, PhD, expanded the least-squares formulation
to provide an improved estimate and to minimize the number of observations
required. This concept became known as Minimal Station Participation
(MSEAR).³ The RCP is an extension of this concept. The scheme has been
adapted to cinetheodolites, telescopes, radar, and DOVAP for position
and attitude applications. The RCP model uses for input empirically
developed measurement error probability tables from each measurement
system, a proposed flight test trajectory of a specified test object,
and the uncertainty (flight test requirements) in the flight test data
that a Range User can tolerate in his experiment. The probability tables
are used to compute the probability of a particular data error for a
selected or given geometry configuration. The final output is in terms
of the probability of meeting a particular Range User requirement.
Hence, cost-to-support trade-offs can be developed based on the risk
a user may want to take in completing his experiment. The less risk
the user can accept, the higher the support cost.

¹J. V. Carrillo and R. L. Garcia, A Technique for Computing the
Probability of Meeting a User's Trajectory Requirement, QA Technical
Report No. 121, (WSMR, NM, 1975).

²R. C. Davis, Techniques for the Statistical Analysis of Cinetheodolite
Data, (China Lake, California, 1951), page 1.

³C. A. Hall, Deleting Observations From a Least-Squares Solution,
Proceedings of the Eleventh Conference on the Design of Experiments in Army
Research Development and Testing, ARO-D Rpt 66-2, (Durham, NC, 1966).

The problem may be restated as: "Determine the probability of satisfying
a Range User's requirement for a test object's position and/or attitude
over a given interval, such that the results will allow cost trade-off
analyses."

The problem statement gives rise to three specific questions: how to
identify the minimum set, how to find the probability of success, and
how to solve the problem with a computer. The approach taken obviates
the need to answer the first question (as we shall see). The latter two
are the substance of this paper.

2.  ESTIMATION OF THE PROBABILITY OF SUCCESS. Error estimates can
be described probabilistically and, of course, reliabilities are
probabilities. Thus, they can be combined in a probabilistic formulation.
The probabilities involved in the estimation of meeting a requirement for
one point of a trajectory can be expressed in equation form as:

$$P(\mathrm{Rqmt})_i = \sum^{M} \left[ P(\sigma_\theta \le S_r) \times P(\mathrm{Sta\ Opr}) \right] \qquad \text{(Eq 1)}$$

where,

$P(\mathrm{Rqmt})_i$ = Probability of meeting the requirement at the ith point

$\sigma_\theta$ = Error in observed data

$S_r$ = Maximum allowable error from the requirement

$P(\mathrm{Sta\ Opr})$ = The probability of successful station operation

$M = \sum_{C} \binom{x}{C}$, where C = 2, 3, ..., x

x = total number of sites available


The probability for the entire trajectory is the distribution of the
chances for success at all points from the population of occurrences
and is found by simply averaging the risk over all points:

$$P(\mathrm{Rqmt}) = \frac{\sum_{i=1}^{R} P(\mathrm{Rqmt})_i}{R} \qquad \text{(Eq 2)}$$

where

R = the number of trajectory points.

The only unknown parameter in Equation 1 is $\sigma_\theta$. It is found in the
following manner.

The basic regression relationship is

$$\phi = B\theta$$

where,

$\phi$ = Matrix of Observations

$B$ = Jacobian Matrix

$\theta$ = Matrix of Derived Trajectory Data
Solving for $\theta$

$$\theta = (B^T W B)^{-1} B^T W \phi \qquad (W = \text{Weight Matrix})$$

$$\sigma_\theta^2 = (B^T W B)^{-1} \quad \text{for } W = [\sigma_\phi^2]^{-1}$$

$$\sigma_\theta^2 = \sigma_\phi^2 (B^T B)^{-1} \qquad \text{(Eq 3)}$$

This last equation (Eq 3) defines the data error in terms of Geometric
Dilution of Precision (GDOP) and measurement error, both of which are
known or knowable. For a given geometry, $(B^T B)^{-1}$ is deterministic
while $\sigma_\phi^2$ is probabilistic. Thus, the probabilistic nature of
$\sigma_\theta^2$ is dependent on the probabilistic nature of $\sigma_\phi^2$.

In actual practice, a requirement, $S_r$, is defined as the trace of
a variance-covariance matrix. We may, therefore, attack the heuristic
nature of $\sigma_\phi^2$ simply by introducing a scalar "$s^2$".

(Eq 4)


Equation 1 becomes


M 

PCRqmt)^  =  I  [P(S^  <  sa^)  x  p(Sta  Opr)]^ 


The formula for computing the probability that exactly M of N scheduled
instruments operate successfully is:

$$P(\mathrm{Sta\ Opr}) = \sum R_1 R_2 \cdots R_M \, Q_{M+1} Q_{M+2} \cdots Q_N \qquad \text{(Eq 5)}$$

where $R_1, R_2, \ldots, R_M$ are the reliability values for instruments 1, 2,
3, ..., M, and $Q_1, Q_2, \ldots, Q_N$ are the $(1-R_1), (1-R_2), \ldots, (1-R_N)$
values for each of the instruments, respectively. The sum is taken over every
selection of M operating instruments from the N scheduled. Note that there are

$$\frac{N!}{M!\,(N-M)!}$$

terms to be added in Eq 5.

An example of the computational procedure for a point is shown in
Appendix 1.
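
The sum in Eq 5 can be sketched directly by enumerating every selection of M operating instruments. This is an illustration only, not the UNIVAC 1108 program; the reliability values and the function name are hypothetical, and instrument failures are assumed to be independent.

```python
from itertools import combinations
from math import comb, prod


def prob_exactly_m_operate(reliabilities, m):
    """Probability that exactly m of the scheduled instruments operate.

    Sums, over every choice of m operating instruments, the product of
    R_i for the operating ones and Q_i = 1 - R_i for the rest
    (independent failures assumed).
    """
    n = len(reliabilities)
    total = 0.0
    for operating in combinations(range(n), m):
        up = set(operating)
        total += prod(r if i in up else 1.0 - r for i, r in enumerate(reliabilities))
    return total


# Hypothetical reliabilities for four scheduled instruments.
reliabilities = [0.95, 0.90, 0.85, 0.80]
probs = [prob_exactly_m_operate(reliabilities, m) for m in range(len(reliabilities) + 1)]
for m, p in enumerate(probs):
    print(f"exactly {m} operate: {p:.4f}  ({comb(len(reliabilities), m)} terms)")
```

When all reliabilities are equal, the sum reduces to the familiar binomial probability, which gives a convenient check on the enumeration.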

3.  FITTING THE MODEL ON THE COMPUTER. A little thought on the
computational times for Equation 5 will lead one to the realization that
the time will approximately double for each additional site added. This
was verified for the program prepared for the UNIVAC 1108 computer: a 5
station solution taking 2 seconds, 11 stations taking 1 minute, 15
stations taking 14 minutes, etc. Alternatives to minimize this problem
were (1) to improve the speed of each computation or (2) to reduce the
number of candidate sites. The latter course was pursued.
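
The doubling can be seen from the count of station combinations, M, defined with Equation 1 as the sum of the binomial coefficients for C = 2, 3, ..., x. The sketch below simply counts those combinations for the station totals quoted above; it does not model run time on any machine, and the function name is hypothetical.

```python
from math import comb


def station_combinations(x, minimum=2):
    """Number of site subsets of size `minimum` or more among x sites."""
    return sum(comb(x, c) for c in range(minimum, x + 1))


counts = {x: station_combinations(x) for x in (5, 6, 11, 15)}
for x, m in counts.items():
    print(f"{x:2d} sites -> {m} combinations")

# Adding one site roughly doubles the number of combinations.
growth = counts[6] / counts[5]
```

The count equals 2^x - x - 1, so each additional site roughly doubles the number of combinations to be evaluated.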

An initial screening was derived based on instruments' operating
limitations.


OPTICS - Elevation Angle - Between 3° and 80°

         Image Size - >35 Microns (μ) for Position
                      >100 Microns (μ) for Attitude


RADAR & DOVAP - Elevation Angle - Between 10° and 80°


Next, each surviving site is ordered in accordance with its
contribution to the error. For each point, an error constant⁴ is
calculated from:


D4  =  I  for  the  jth  site 

KL  ^ 


K is an index of observation ($\phi$)

L is an index of computed values ($\theta$)
L = 1, 2, 3, ..., $\theta$


and


W is a weight matrix from $\sigma_\phi^2 W = I$

from $\sigma_\theta^2 = \sigma_\phi^2 (B^T B)^{-1}$


⁴cf. Ref 1
The $D_j$'s then relate to $\sigma_\theta$ from


0  TR 


A 

I  D . 


where,

A  =  The  set  of  sites  used 

L  =  3  for  Position  data 
2  for  Attitude  data 


The $D_j$'s vary with GDOP; therefore, the largest value at one point may be
smaller than the smallest value at another point. Since all points are
assumedly of equal importance to a customer, the GDOP effect ($D_j$'s) must
be normalized. This is accomplished by the following scheme. First, an
average $D_j$ is computed. This average value is divided into each $D_j$ value
for all points. Then, each site's normalized point value is summed over
all points. The sites are then ordered (largest to smallest) based on the
magnitude of the sum. The first three sites (with the largest values)
are then selected for the first estimate of meeting a user's trajectory
requirement. If the probability of meeting the requirement is sufficient,
the computation is terminated. If the probability is insufficient, the
site with the next largest value is added to the computation. This
procedure is continued until the desired probability is obtained or all
the sites in the group are used. This procedure has resulted in minimizing
the number of sites required.


In evaluating the procedure, it was found that the sites selected
produce the maximum P(Rqmt) 95% of the time; and for the remaining 5%,
the P(Rqmt) was within 3% of the maximum.
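
The ordering and selection scheme described above can be sketched as follows. This is a minimal illustration under stated assumptions: the average is taken over the sites at each trajectory point, the error constants are made-up values, and `toy_probability` is a stand-in for the full P(Rqmt) computation of Equations 1 and 2, which is not reproduced here. All names are hypothetical.

```python
def rank_sites(d):
    """Order sites by the sum of their normalized error constants.

    d is a list with one dict per trajectory point, mapping site name
    to its error constant at that point. Each point's values are divided
    by that point's average (an assumption about the normalization)
    before being summed over all points.
    """
    sites = list(d[0].keys())
    scores = {s: 0.0 for s in sites}
    for point in d:
        average = sum(point.values()) / len(point)
        for s in sites:
            scores[s] += point[s] / average
    return sorted(sites, key=lambda s: scores[s], reverse=True)


def select_sites(d, probability_of_meeting, required, start=3):
    """Begin with the top `start` sites and add one at a time until the
    required probability is reached or all sites are in use."""
    ranked = rank_sites(d)
    k = min(start, len(ranked))
    while True:
        chosen = ranked[:k]
        p = probability_of_meeting(chosen)
        if p >= required or k == len(ranked):
            return chosen, p
        k += 1


# Made-up error constants for four sites at two trajectory points.
d = [
    {"A": 4.0, "B": 2.0, "C": 1.0, "D": 1.0},
    {"A": 0.4, "B": 0.1, "C": 0.3, "D": 0.2},
]


def toy_probability(chosen):
    """Placeholder for the real P(Rqmt) calculation."""
    return 0.5 + 0.1 * len(chosen)


ranking = rank_sites(d)
chosen, p = select_sites(d, toy_probability, required=0.85)
print(ranking, chosen, round(p, 2))
```

Because sites are added only until the requirement is met, the search avoids evaluating every combination of candidate sites.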
6.  CONCLUSIONS. The models discussed in this paper can be used for
analyzing cost-to-support trade-offs. Cost-to-support is related directly
to the type and amount of instrumentation necessary to meet a particular
user requirement. Thus, the output of the RCP Model provides the information
necessary for risk analysis from a measurement aspect. It is readily apparent
that the more stringent the error requirement or the less risk of data loss
a user can accept, the higher the cost-to-support.

There are limitations to the model. First, since the error and
reliability values used are based on history, changing performance will
result in erroneous answers; further, since the present reduction process
is modeled in the equations, a change in the procedure will necessitate
revision of the model.
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NOK-RAiroOMIZED  CLINICAL  TRIALS 


E.  A.  Gehan  and  E.  J.  Freireich 
The  University  of  Texas  System  Cancer  Center 
Houston,  Texas 

ABSTRACT 

This  paper  gives  a  general  discussion  of  some  principles  involved  in 
planning  comparative  studies,  namely,  the  objectives,  comparability  of 
patients,  feasibility,  and  ethics.  For  each  principle,  circumstances  are 
given  for  which  a  non-randomized  study  is  to  be  preferred  to  a  randomized 
one.  Examples  of  non-randomized,  controlled  studies  are  presented  utilizing 
literature  controls,  an  acute  leukemia  late  intensification  study  involving 
matched controls, and an acute leukemia sequence of three studies. In the
latter example, adjustment for prognostic factors was carried out to enable
the studies to be compared with respect to response rate and survival.
NON-RANDOMIZED CLINICAL TRIALS

E.A. Gehan
and

E.J. Freireich

The  University  of  Texas  System  Cancer  Center 

1.  Introduction 

Consider  the  design  of  the  following  Army  experiment  (hypothetical). 
Because  of  the  need  for  saving  money,  an  officer  in  the  Quartermaster  Corps 
does  a  study  of  shoe  sizes  for  Army  recruits.  He  finds  that  the  distribution 
of  shoe  sizes  has  several  peaks  and  that  it  would  be  possible  to  save  money  in 
buying  shoes  by  ordering  only  a  small  number  of  sizes.  He  decides  that  the 
best  way  to  determine  which  sizes  to  buy  is  from  a  randomized  comparative  study. 
His idea is to issue three sizes of shoes, 8½, 9½ and 10½, randomly to incoming
recruits; their "response" to a particular shoe will be measured following
a ten mile hike by interview and a physician's examination. The ultimate
objective  is  to  choose  a  single  size  of  shoe  for  all  recruits.  What  is  wrong 
with  this  experiment?  The  objective  is  stated  clearly,  the  designed  experiment 
could  be  carried  out,  treatments  would  be  assigned  at  random  and  there  wouldn't 
be  much  difficulty  in  measuring  reaction  of  the  recruits  to  the  assigned  shoes. 
It  is  obvious  that  the  whole  experiment  is  ridiculous  because  each  individual 
has  his  own  shoe  size  and  a  choice  of  shoes  should  be  made  accordingly.  Random¬ 
ization,  in  this  case,  added  only  a  pseudo-scientific  aspect  to  the  experiment. 
The outcome could be predicted well and a great deal of suffering would be
caused  among  the  Army  recruits  selected  for  the  study  -  either  by  randomization 
or  otherwise.  In  clinical  research,  treatment  must  often  be  tailored  to  the 
individual  patient  either  in  terms  of  dosage  or  schedule  and  a  randomized  com¬ 
parative  study  is  difficult  to  accomplish  when  treatment  is  individualized. 

Too  often,  randomized  comparative  clinical  trials  are  analogous  to  the  hypo¬ 
thetical  Quartermaster  who  proposed  a  randomized  comparison  of  shoes  of  different 
sizes. 

In  cancer  clinical  trials  and  in  other  disease  entities,  the  patient  is 
in  a  life  or  death  struggle  against  his  disease.  His  objective  is  to  win  the 
battle  and  he  clearly  would  like  to  be  in  the  hands  of  a  physician  who  would  give 
him the best chance of winning. Would the best chance be as a patient in a
randomized comparative study or as an individual receiving care from an outstanding
physician  who  used  his  best  knowledge  of  patient,  disease  and  treatment  to  choose 
a  treatment  plan?  An  analogy  might  be  the  selection  of  a  designer  for  a  car  to 
win  the  Indianapolis  500  mile  race.  Would  a  designer  be  chosen  who  did  a  random¬ 
ized  comparative  study  of  every  design  feature  to  be  added  to  the  car  or  would 
one  choose  an  experienced  designer  with  a  good  record  and  ask  him  to  use  his  best 
judgment to design a car to win the race? Not many individuals would do
randomized comparative studies in an attempt to win the Indianapolis 500; why then the
emphasis  on  randomized  comparative  studies  to  win  the  battle  against  cancer  or 
heart  disease? 

This paper discusses the general considerations involved in choosing
between a randomized and a non-randomized comparative study, and gives some
specific examples of successful non-randomized studies. These
studies  involve  selection  of  control  patients  from  the  literature,  from  matched 
patients  and  from  the  previous  study  in  a  sequence  of  clinical  studies.  Recent 
papers  stressing  the  value  of  non-randomized  studies  are  by  Gehan  and  Freireich 
(1974)  and  Freireich  and  Gehan  (1974). 
2. General Considerations

Four  aspects  of  the  comparative  clinical  trial  will  be  considered.  These 
are:  (a)  objectives;  (b)  comparability  of  patients;  (c)  feasibility;  and  (d) 
ethics. 

(a)  Objectives 

Chalmers,  Block  and  Lee  (1972)  have  published  a  paper  on  controlled  clin¬ 
ical  trials  in  which  the  main  theme  is  illustrated  by  a  humorous  conversation 
between two biostatisticians. First biostatistician, "How's your wife?" Second
biostatistician, "Compared to whom?" The humor of this parable emphasizes two
important  and  distinctive  facts  about  the  man's  wife:  the  first  being  how  does 
his  wife  differ  from  other  wives,  a  comparative  fact;  the  second,  how  is  his 
wife  in  his  own  judgment,  that  is,  what  is  his  estimation  of  his  wife's  capabil¬ 
ities.  This  fundamental  difference  is  frequently  overlooked  in  the  design  and 
conduct  of  a  clinical  study.  It  should  be  emphasized  that  an  important  result  of 
a  therapeutic  investigation  is  the  measurement  in  a  quantitative  sense  of  the 
effectiveness  of  a  given  treatment.  There  are  situations  in  which  the  important 
question  is  not  how  effective  is  this  treatment,  but  is  this  treatment  more  or 
less  effective  than  a  standard  or  some  other  form  of  treatment.  In  general,  the 
latter  question  is  not  as  significant  as  the  former  -  for  both  treatments  and 
wives.

An  essential  ingredient  of  clinical  research  is  a  significant  objective. 
Too  often  the  concept  of  randomization  is  equated  with  the  concept  of  research 
while non-randomization is equated with "non-scientific" or "uncontrolled". One
cannot  replace  the  intelligent,  imaginative,  creative  work  of  a  clinical  scientist 
with  the  routine  application  of  a  clinical  trial  technique.  In  cancer  research, 
there are many examples of non-randomized studies that have led to important
alterations in methods of treating patients. Examples are the discovery of
mechlorethamine in the treatment of Hodgkin's disease, the first antimetabolite
methotrexate in the treatment of patients with acute leukemia, vincristine in
acute leukemia, and combination chemotherapy in lymphoma and Hodgkin's disease.
These were all dramatic advances in the treatment of patients with malignant
disease and this knowledge was derived from non-randomized clinical studies.
What new and effective treatments have been discovered utilizing randomized
clinical studies?

(b)  Comparability  of  patients 

As A.B. Hill (1962) has put it, a sine qua non in the proper conduct of a
controlled clinical trial is having comparable groups of patients. A clinical
trial designed to evaluate the relative effectiveness of two or more treatments
should be planned so that the only differences among treatment groups are in the
actual treatment received. This requires comparability of patients as they are
entered into study, managed when on study, and analyzed when the study is completed.

The entry of patients will be discussed here and one technique for achieving
comparability of patients is randomization, possibly stratified so that there
are separate randomizations of patients in prognostic categories. Even the
proponents of randomization agree that randomization guarantees comparability of
patients on the average and this needs to be checked in every clinical trial. It
may even be argued that randomization is a guarantee of non-comparability of
treatment groups with respect to some patient characteristics, if enough patient
characteristics are examined. For example, if there were a 5% chance that the random
assignment of patients would lead to a significant difference between treatment
groups with respect to a given patient characteristic and the distribution of 20
characteristics were considered, it would be expected that there would be a
significant imbalance between groups with respect to at least one characteristic.
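
The arithmetic behind the 20-characteristic example can be checked with a short sketch. It assumes, purely for illustration, that the 20 characteristics are tested independently, which the text does not state; the function name is hypothetical.

```python
def imbalance_summary(n_characteristics: int, alpha: float):
    """Return (expected number of significant imbalances,
    probability of at least one), assuming independent tests."""
    # Each characteristic has probability `alpha` of a chance imbalance.
    expected = n_characteristics * alpha
    # Probability that at least one of the tests is significant.
    p_at_least_one = 1.0 - (1.0 - alpha) ** n_characteristics
    return expected, p_at_least_one


expected, p_any = imbalance_summary(20, 0.05)
```

Under the independence assumption the expected number of significant imbalances is one, which is the sense of "it would be expected" in the text, while the probability of seeing at least one is somewhat below certainty.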

As Daniel (1970) has pointed out, "Randomization is a confession of ignorance,
full randomization is a confession of full ignorance." In other words, a full
randomization should be accomplished only when a clinical investigator is not
cognizant of any patient characteristics that influence prognosis.
Another technique for achieving comparability of patients at time of
entry into study is to select patients for a control group according to certain
characteristics, namely those which are known to influence prognosis. If
treatment A is the treatment under study and treatment B is a standard or "control"
treatment  which  is  to  be  compared  with  A,  the  control  group  of  B  patients  could 
be  selected  from  the  literature,  chosen  on  a  matched  basis  from  previously  or 
concurrently  conducted  clinical  studies,  or  selected  from  the  previous  study  in 
a  sequence.  The  primary  assumption  needed  for  selecting  a  control  group  is  that 
the  important  patient  characteristics  related  to  prognosis  are  known,  so  that 
there  is  a  firm  basis  for  selecting  a  comparable  group  of  patients.  Further,  it 
must  be  assumed  that  differences  which  do  exist  between  the  groups  selected  (such 
as  time,  institution,  physician,  or  the  availability  of  supportive  care)  have  little 
or  no  relation  to  the  outcome  of  the  treatment.  In  a  disease  which  has  been 
studied  extensively,  techniques  of  regression  analysis  can  be  used  to  determine 
patient  characteristics  related  to  prognosis.  See  Armitage  and  Gehan  (1974)  for 
a review of available methods. Some examples will be discussed in section 3.

(c)  Feasibility 

In  general,  the  feasibility  of  a  particular  study  relates  to  the  number 
of patients required and its duration. For a particular investigator or group
of clinical investigators, one can compare the strategy of proceeding from one
fairly large study to the next, each based on a single treatment vs. the strategy
of randomizing between two treatments in each study. Suppose the investigators
in both circumstances have exactly the same requirements concerning the number of
patients  to  be  studied  on  each  treatment.  Suppose  the  number  required  for  each 
treatment  is  N  and  the  group  of  investigators  accrues  this  number  of  patients  in 
one  year.  Assuming  that  no  follow-up  period  is  required  for  observing  the  effect 
of treatment, the strategy of proceeding sequentially from one study to the
next  means  that  one  year  will  be  required  for  each  study.  The  investigator  who 
always  randomizes  between  two  treatments  requires  two  years  to  complete  each 
study. It is true that at the end of two years, an investigator following either
strategy will have evaluated two treatments; however, the investigator who does
sequential  studies  will  have  an  opportunity  to  choose  a  second  treatment  based 
upon  the  results  of  the  first.  Further,  some  investigators  adopt  the  practice 
of  always  carrying  along  the  best  treatment  from  a  previous  study  in  the  current 
study;  this  results  in  evaluating  three  treatments  every  four  years  compared 
with  four  treatments  for  the  investigator  who  proceeds  sequentially.  The  latter 
investigator  will  have  had  the  opportunity  to  build  upon  knowledge  gained  from 
previous  studies  to  choose  three  treatments,  while  the  investigator  preferring 
simultaneous  comparisons  will  have  chosen  only  one  new  treatment  based  upon  the 
results  of  a  previous  study. 
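
The counts in this comparison can be reproduced with a small sketch. It assumes, as the text does, one year of accrual per treatment arm and no follow-up period; the function and key names are hypothetical.

```python
def treatments_evaluated(total_years: int, years_per_arm: int = 1) -> dict:
    """Count treatments evaluated under each strategy (illustrative only)."""
    # Sequential: one single-treatment study every `years_per_arm` years.
    sequential = total_years // years_per_arm
    # Randomized: two arms per study, so each study takes twice as long.
    studies = total_years // (2 * years_per_arm)
    # Carry-along: the first study has two new treatments, and each later
    # study has one new treatment plus the best one carried forward.
    carry_along = studies + 1 if studies > 0 else 0
    return {
        "sequential": sequential,
        # Treatments chosen with results of an earlier study in hand.
        "sequential_informed": max(sequential - 1, 0),
        "randomized_carry_along": carry_along,
        "randomized_informed": max(studies - 1, 0),
    }


four_year_counts = treatments_evaluated(4)
```

For a four-year horizon the sketch gives four treatments for the sequential strategy and three for the carry-along randomized strategy, with three and one of them, respectively, chosen in light of earlier results.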

Suppose  an  investigator  is  doing  a  simultaneous  comparison  of  treatments 
A  and  B  in  which  a  fixed  number  of  patients  is  to  receive  each  treatment  so  that 
the  difference  in  response  rates  can  be  detected  at  a  given  significance  level  and 
power  of  test.  These  specifications  lead  to  n  patients  being  required  on  each 
treatment  and  tables  of  n  are  readily  available  in  textbooks  (Cochran  and  Cox, 
1957)  (Holland  and  Frei,  1973).  An  experimenter  who  does  studies  in  sequence  of 
one  treatment  might  be  prepared  to  assume  that  the  response  rate  to  the  control 
treatment  (B)  is  so  well  known  that  it  may  be  taken  as  a  fixed  quantity,  say  p, 
and  no  patients  need  receive  B  in  the  trial.  To  carry  out  a  statistical  test  of 
the  difference  between  the  proportion  of  patients  responding  to  A  and  B  at  the 
same  significance  level  and  power  assumed  above,  only  n/2  patients  are  needed  on 
treatment  A,  which  is  only  1/4  the  total  number  of  patients  required  for  the  ran¬ 
domized  comparative  trial.  When  the  cost  of  supporting  clinical  studies  is  often 
in excess of $1,000,000 per year, a savings of patients and duration of study
has  a  substantial  dollar  equivalent.  Even  when  the  response  rate  to  the  con¬ 
trol  treatment  is  not  known  precisely,  it  may  still  be  reasonable  to  proceed 
as  if  it  is  known.  For  example,  in  the  treatment  of  patients  with  advanced  lung 
cancer,  the  expected  percentage  of  patients  responding  to  standard  treatment  is 
very  low  (less  than  20%)  and  survival  is  poor.  In  this  circumstance,  it  would 
be sensible to test a proposed therapy against a specified percentage, say 20%.
The objective would be to find a new treatment that has a response percentage
significantly higher than 20%.
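
The saving in patients when the control response rate is treated as known can be illustrated with standard normal-approximation sample-size formulas. This is a sketch under assumptions that are not in the text: a one-sided test at the 5% level, 80% power, and a hypothetical new-treatment response rate of 40% against the 20% standard. The function names are also hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist


def two_arm_n_per_arm(p_control, p_new, alpha=0.05, power=0.80):
    """Patients per arm for a randomized comparison of two proportions
    (normal approximation, one-sided test, unpooled variances)."""
    z_a = NormalDist().inv_cdf(1 - alpha)
    z_b = NormalDist().inv_cdf(power)
    variance = p_control * (1 - p_control) + p_new * (1 - p_new)
    return ceil((z_a + z_b) ** 2 * variance / (p_new - p_control) ** 2)


def single_arm_n(p_control, p_new, alpha=0.05, power=0.80):
    """Patients needed when the control rate is taken as a fixed value
    (normal approximation, one-sided one-sample test)."""
    z_a = NormalDist().inv_cdf(1 - alpha)
    z_b = NormalDist().inv_cdf(power)
    numerator = (z_a * sqrt(p_control * (1 - p_control))
                 + z_b * sqrt(p_new * (1 - p_new))) ** 2
    return ceil(numerator / (p_new - p_control) ** 2)


n_per_arm = two_arm_n_per_arm(0.20, 0.40)   # randomized trial, each arm
n_single = single_arm_n(0.20, 0.40)         # new treatment only
fraction_of_total = n_single / (2 * n_per_arm)
```

With these inputs the single-arm requirement comes out at roughly one quarter of the total for the randomized trial, in line with the n/2 versus 2n argument, although the exact ratio depends on the response rates assumed.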

(d)  Ethics 

All  clinical  investigators  seek  results  which  demonstrate  that  the  overall 
prognosis  for  patients  is  getting  better.  Clinical  trials  in  which  patients  do 
less  well  than  they  have  in  the  past  are  to  be  avoided  at  all  costs  and  to  be  con¬ 
cluded  as  early  as  possible.  A  comparative  clinical  trial  should  not  be  started 
unless there is some preliminary evidence suggesting that the new therapy is at
least as good as, and possibly better than, the standard. If this is accepted, the
question  can  be  raised  whether  it  is  ethical  to  enter  patients  on  the  standard 
therapy  when  there  is  little  or  no  chance  that  the  standard  could  be  better  than 
the  new  therapy.  That  is,  the  objective  should  be  to  study  the  new  therapy  until 
it  can  be  concluded  whether  the  new  therapy  is  significantly  more  effective  than 
the  standard  or  not.  Study  of  the  new  therapy  could  be  stopped  when  the  probability 
of  its  being  more  effective  than  the  standard  becomes  very  low. 

The  clinical  investigator  conducting  studies  in  sequence  of  treatments  is 
always  giving  what  he  considers  to  be  the  best  treatment  to  his  patients.  Re¬ 
cruitment  of  patients  to  a  clinic  to  receive  this  treatment  is  much  easier  than 
for  the  investigator  who  proceeds  by  simultaneous  comparisons.  The  former  inves¬ 
tigator  can  promise  all  patients,  even  those  who  come  from  long  distances,  that 
they  will  receive  what  the  investigator  thinks  is  the  current  best  treatment.  The 
latter type of investigator can promise only that the choice of treatment will be
determined essentially by flipping a coin and that the treatments in the clinical
trial are reasonably good ones.

Meier (1975) has stated the ethical problem as follows: "The view is
often expressed that each patient must be afforded the presumed benefit of any
estimated advantage of one treatment over another, regardless of how slight or
uncertain that advantage may be. I insist that this view does not reflect my
attitude  about  myself  as  a  patient,  nor  does  it  reflect  the  attitude  of  most  of 
us.  Make  no  mistake  about  it,  this  position  is  incompatible  with  any  experimenting 
whatever,  controlled  or  casual.  It  does  not  favor  judicious  experimenting  with  a 
new  technique  or  drug  on  carefully  selected  patients.  That,  after  all,  can  be  done 
in a controlled study. Rather, it forbids any experimenting at all." The ethical
dilemma  disappears  if  one  proceeds  sequentially  in  evaluating  treatments  -  the 
presumed  best  treatment  is  always  being  given.  However,  what  Meier  and  many  other 
statisticians  do  not  accept  is  that  conducting  studies  in  sequence  can  resolve 
the scientific problem of properly evaluating the relative effectiveness of
treatments. This will be demonstrated by some examples from cancer clinical trials.

3.  Examples  of  Non-Randomized  Clinical  Trials 

In  this  section,  some  examples  of  non-randomized  clinical  trials  are  given 
in  which  patients  in  the  control  group  were  selected  to  be  comparable  to  those 
receiving  a  study  treatment.  Patients  in  the  control  group  were  selected  based 
upon  their  prognostic  characteristics  and  the  assumption  was  made  in  all  studies 
that the patient characteristics chosen accounted for the major proportion of the
patient-to-patient  variability  in  response.  Literature  controls,  matched  controls, 
and  patients  from  a  sequence  of  studies  will  be  considered  in  relation  to  the  eval¬ 
uation  of  study  treatments. 
(a) Literature Controls

In  all  circumstances  in  which  the  same  or  similar  treatments  have  been 
used  by  others  in  a  clinical  investigation,  it  is  desirable  to  use  these  patients 
as  controls,  even  when  there  is  also  an  internal  group  of  control  patients  in  the 
trial.  Unfortunately,  it  is  usually  true  that  authors  do  not  provide  sufficient 
data  in  their  papers  so  that  it  can  be  checked  whether  the  patients  reported  in 
the  literature  are  comparable  to  those  in  a  given  clinical  trial.  It  certainly 
would  be  helpful  if  authors  and  those  engaged  in  large  cooperative  group  studies 
could make available basic data on punch cards or computer tape so that others
might use the data for literature controls.

An  example  of  a  literature  control  group  is  given  in  the  study  reported 
by  Luce  et  al  (1971)  in  which  combined  cyclophosphamide,  vincristine  (Oncovin), 
and  prednisone  therapy  (COP)  for  malignant  lymphoma  was  compared  to  single  agent 
treatment  with  cyclophosphamide  or  a  vinca  alkaloid  (vinblastine  for  Hodgkin's 
disease  and  vincristine  for  lymphosarcoma)  as  reported  by  Carbone,  Spurr,  et  al 
(1968).  All  patients  in  both  studies  had  stage  III  or  IV  disease.  However,  patients 
who  had  received  major  prior  chemotherapy  or  those  with  moderately  impaired  bone 
marrow  reserve  were  excluded  from  the  Carbone  study.  Thus,  in  terms  of  prior 
treatment  and  bone  marrow  reserve  -  two  important  prognostic  factors  -  patients  who 
had  received  little  or  no  prior  treatment  in  the  Luce  study  were  comparable  to  those 
in  the  Carbone  study.  The  age  and  sex  distributions  were  similar  in  the  studies. 
Hence,  when  adjustment  was  made  for  prior  therapy,  it  could  be  concluded  that 
patients  in  the  Carbone  study  were  comparable  to  those  in  the  Luce  study.  The 
complete  remission  rate  following  COP  treatment  was  36-50%  in  malignant  lymphoma 
compared  with  6-20%  for  the  single  agent  treatment  reported  by  Carbone.  In  addition, 
other  series  of  patients  receiving  either  single  agents  or  COP  treatment  by  a 
slightly different schedule had similar results. Because both single agents and
COP had response rates that were consistent from one study to the next and the
evidence indicated that COP was significantly superior, it seemed safe to conclude
that COP was superior to single agent treatment in the induction of complete remissions.

Another  example  is  that  given  by  Sutow  et  al  (1970)  in  which  the  survival 
experience of patients with Wilm's tumor or neuroblastoma, first treated in 1962,
was compared to that of patients first treated in 1956. A total of 35 institutions
participated in the study and, for patients with Wilm's tumor, it was demonstrated
that  the  age  distribution,  percentage  of  children  with  metastases,  and  intensity 
of  surgical  and  radiation  therapy  were  comparable  between  the  two  time  periods. 
However,  94%  of  patients  received  drug  therapy  (mainly  actinomycin-D,  vincristine, 
and cyclophosphamide) in 1962 compared with 28% in 1956. A significant improvement
in survival was demonstrated for patients of all ages without metastases and for
patients  two  years  or  older  with  metastases.  The  authors  concluded  that  the  in¬ 
creased  clinical  use  of  chemotherapeutic  agents  resulted  in  the  significant  improve¬ 
ment  in  the  survival  curves.  For  patients  with  neuroblastoma,  though  there  was 
a  slight  difference  in  the  survival  experience  for  both  non-metastatic  and  meta¬ 
static  patients  favoring  those  first  treated  in  1962,  the  difference  was  not  near 
statistical  significance  and  it  was  concluded  that  the  increased  use  of  chemo¬ 
therapeutic  agents  did  not  result  in  a  significant  improvement  in  survival  time. 

A  literature  control  group  is  useful  when  patients  can  be  checked  for 
comparability  and,  in  some  circumstances,  when  it  can  be  demonstrated  that  patients 
in  the  literature  have  more  favorable  prognostic  indicators.  Authors  should  be 
encouraged  to  have  details  of  their  data  available  to  others  for  comparison  purposes. 

(b)  Matched  Controls 

In  a  matched  control  study  in  which  patients  are  to  be  selected  from  a 
group  of  patients  treated  in  the  past,  all  new  patients  would  receive  the  treatment 
to be evaluated, say treatment A. A pairmate for each patient receiving A would
be  chosen  at  random  from  among  the  possible  pairmates  in  the  group  of  historical 
control  patients  who  received  treatment  B.  The  applicability  of  this  approach 
depends  upon  having  a  sufficiently  large  group  of  patients  for  potential  pairmates. 
Patients  obtained  by  this  process  who  receive  treatment  A  would  be  as  comparable 
as  possible  to  those  on  treatment  B  with  respect  to  the  patient  characteristics 
used  as  a  basis  for  the  pairing.  If  sufficient  patients  are  available,  it  may  be 
desirable  to  select  two  control  patients  for  each  treated  patient,  making  a  com¬ 
parison  between  control  patients  to  test  the  selection  process. 
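
The pairing step described above can be sketched as follows. This is an illustration, not the procedure used in any of the studies cited: the matching keys, the record layout, and the choice to draw pairmates without replacement are all assumptions made for the example.

```python
import random
from typing import Dict, List, Optional, Sequence, Tuple

# Hypothetical matching characteristics; a real study would use the
# prognostic factors known for the disease in question.
MATCH_KEYS = ("age_group", "cell_type", "remission_group")


def choose_pairmates(
    treated: List[Dict],
    controls: List[Dict],
    keys: Sequence[str] = MATCH_KEYS,
    seed: Optional[int] = None,
) -> List[Tuple[Dict, Optional[Dict]]]:
    """Pair each treated patient with a randomly chosen historical control
    that agrees on every matching characteristic."""
    rng = random.Random(seed)
    available = list(controls)  # controls not yet used as a pairmate
    pairs = []
    for patient in treated:
        # Indices of the possible pairmates for this patient.
        candidates = [
            i for i, c in enumerate(available)
            if all(c[k] == patient[k] for k in keys)
        ]
        if candidates:
            # Choose at random among the possible pairmates and remove
            # the chosen control so it is not used twice (an assumption).
            mate = available.pop(rng.choice(candidates))
        else:
            mate = None  # no matched control could be found
        pairs.append((patient, mate))
    return pairs
```

Patients paired with `None` correspond to treated patients for whom no matched control exists, which is why a sufficiently large pool of historical patients is needed.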

An  example  of  this  type  of  study  is  given  by  Bodey  et  al  (1976)  who  com¬ 
pared  the  length  of  complete  remission  for  patients  with  acute  leukemia  between 
two  groups:  a  study  group  receiving  late  intensification  chemotherapy  and  immuno¬ 
therapy  a  median  of  89  weeks  (range  of  58  to  194  weeks)  after  achievement  of  com¬ 
plete  remission  vs.  a  matched  control  group  of  patients  who  received  maintenance 
therapy  at  monthly  intervals,  generally  the  same  therapy  that  induced  the  remis¬ 
sion.  The  objective  of  the  late  intensification  study  was  to  cure  the  patient  by 
administering  an  intense  program  of  therapy  with  new  agents  when  the  leukemia  cell 
population  was  at  a  minimum.  Patients  were  matched  by  age  group,  cell  type,  and 
length  of  remission  prior  to  the  start  of  late  intensification  therapy.  There 
were  17  patients  in  the  matched  control  group  and  19  in  the  group  receiving  late 
intensification therapy (matched controls could not be found for two patients).
The median duration of complete remission subsequent to late intensification
therapy has not yet been reached but will be in excess of 98 weeks, only 5 of 19
patients having relapsed. The median length of subsequent remission in the matched
control group was 24 weeks and there is a highly significant statistical difference
between the two remission curves (P<.01). Comparing survival times between groups,
16 of the 19 patients receiving late intensification treatment are still alive
and  their  median  follow-up  time  is  97  weeks.  The  median  survival  time  for  patients 
in  the  matched  control  group  is  56  weeks  and  the  difference  between  curves  was 
highly  statistically  significant  (P<.01).  Thus,  this  study  has  demonstrated  the 
importance of a new concept in the treatment of patients with acute leukemia that
may  have  resulted  in  a  cure  of  some  patients. 

Another  study  by  Bodey  et  al  (1971)  in  patients  with  acute  leukemia  demon¬ 
strated  that  patients  in  a  protected  environment  (PE)  receiving  prophylactic  anti¬ 
biotics  and  chemotherapy  had  significantly  better  length  of  complete  remission 
(median of 55 weeks for PE, 26 weeks for controls), length of survival (median of
34  weeks  for  PE,  23  weeks  for  controls),  and  percentage  of  days  spent  with  infec¬ 
tion  as  related  to  neutrophil  count  than  a  matched  control  group  of  patients 
treated  outside  a  protected  environment. 

(c)  Controls  Selected  from  a  Sequence  of  Studies 

There  are  many  cooperative  groups  engaged  in  cancer  research  in  the  USA 
who  proceed  from  one  study  to  the  next.  Generally,  there  is  little  change  over 
short  intervals  of  time  in  institution,  type  of  patient,  criteria  for  diagnosis 
and  response,  and  availability  of  supportive  therapy.  In  this  circumstance,  it 
is  sensible  to  compare  results  from  a  previous  study  with  those  of  a  current  one. 
Using  patients  from  a  previous  study  as  controls  might  be  misleading  if  a  rela¬ 
tively  long  time  interval  had  elapsed  between  studies  (say  greater  than  3  years) 
or  if  it  could  be  demonstrated  that  important  changes  had  taken  place  with  respect 
to  clinical  investigators,  type  of  patient,  criteria  for  evaluation,  etc.  There 
are  about  25  cooperative  groups  in  the  United  States  supported  by  the  National 
Cancer  Institute  that  proceed  directly  from  one  study  to  the  next,  have  a  stable 
group  of  clinical  investigators,  see  the  same  types  of  patients  from  year  to  year, 
have  the  same  access  to  supportive  therapy  measures  and  generally  use  the  same 
criteria of response in successive studies. Using patients from a previous study
as  controls  would  often  be  feasible  for  such  cooperative  groups. 

Examples  from  studies  conducted  by  the  Southwest  Oncology  Group  demonstrate 
that  the  same  treatment  administered  in  successive  studies  may  be  expected  to  lead 
to  the  same  general  result.  In  consecutive  studies  of  previously  untreated  pedia¬ 
tric  patients  with  acute  leukemia,  the  complete  bone  marrow  remission  rates  for 
patients  treated  with  vincristine  plus  prednisone  were  83%  (72/87)  in  the  ALinC  #6 
study  and  86%  (237/276)  in  the  ALinC  #7  study  (Lonsdale  et  al,  1975).  In  consecu¬ 
tive  studies  of  patients  with  Hodgkin’s  disease,  the  complete  remission  rate  fol¬ 
lowing  MOPP  treatment  has  remained  very  close  to  80%  for  previously  untreated 
patients  with  stage  III  or  IV  disease. 

When  consecutive  studies  of  different  treatments  have  been  conducted,  re¬ 
gression  models  can  be  utilized  to  test  whether  there  are  significant  treatment 
differences,  adjusting  for  values  of  the  prognostic  characteristics  in  the  succes¬ 
sive studies. If response is the end point for analysis, stepwise logistic
regression procedures can be carried out to interpret the data (Cox (1970), Lee (1974)).
If  survival  or  length  of  response  is  the  end  point,  Cox’s  regression  model  (Cox  (197 
may  be  used.  An  example  will  be  given  from  successive  studies  conducted  in  the 
Southwest  Oncology  Group. 

Over  the  past  several  years,  the  Southwest  Oncology  Group  (SWOG)  has  con¬ 
ducted  the  following  clinical  studies  in  patients  with  adult  acute  leukemia:  COAP 
vs.  OAP  vs.  DOAP  (from  2/71  to  10/72);  a  10-day  OAP  study  (from  6/73  to  1/75); 
and  a  CIAL  study  (from  1/75  to  present) .  The  designations  of  the  drugs  are  as 
follows: C=Cyclophosphamide, O=Vincristine (Oncovin), A=Cytosine Arabinoside, and
P=Prednisone. The CIAL study in the remission induction phase consisted of giving
vincristine plus prednisone to all patients with less than 30,000 blasts in the
peripheral blood; patients with 30,000 or more blasts were randomized
between sequential vs. simultaneous adriamycin-OAP treatment. In the first
study, OAP was given by continuous infusion over a period of five days.

The complete remission rate for 5-day OAP was 43% (39/90), that for 10-day
OAP was 53% (92/173), and the current complete response rate for patients in
the combined groups on CIAL is 60% (70/117). The question arises, do these data
indicate significantly improved complete remission rates by study, or is there
evidence  that  the  types  of  patients  on  the  three  studies  might  explain  the  dif¬ 
ferences  in  complete  remission  rates? 

From  previous  studies  in  adult  acute  leukemia,  the  following  patient  char- 
icteristics  have  been  identified  as  being  predictive  of  response:  age  (years). 
Infection  status  at  start  of  study  (0=no,  l=yes),  acute  myelocytic  leukemia  (0=no, 
l=yes),  hemoglobin  value  (gms  %) ,  and  logarithm  (white  blood  count).  These  five 
patient  characteristics  and  two  variables  representing  the  linear  and  quadratic 
pffect  of  treatments  were  included  in  a  logistic  regression  equation.  The  regr^s- 
>ion  equation  obtained  is  as  follows: 


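$$
\begin{aligned}
\log\frac{\hat{p}}{1-\hat{p}} = {} & .1276 - .0417(\text{Age} - 44.73) + .5027(\text{Treat. linear} - .101) \\
& - .7000(\text{Infection status} - .388) - .3806(\text{AML} - .830) \\
& + .0501(\text{Hemoglobin} - 9.21) - .0597(\log(\text{WBC}) - 4.144) \\
& + .0207(\text{Treat. quadratic} + .407)
\end{aligned}
$$

where $\hat{p}$ is the predicted complete remission rate based on the 7 patient characteristics.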
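
As a minimal sketch of how such an equation is applied, and assuming its left-hand side is the log odds of complete remission (the usual logistic form), the following Python function converts a patient's characteristics into a predicted remission probability. The names `LOGISTIC_TERMS` and `predicted_remission_probability` are illustrative, and the numerical coding of the two treatment variables is not stated in the text, so the caller must supply those values.

```python
import math

# (coefficient, centering value) pairs copied from the fitted equation above.
# The key names are hypothetical labels chosen for this sketch.
INTERCEPT = 0.1276
LOGISTIC_TERMS = {
    "age": (-0.0417, 44.73),
    "treat_linear": (0.5027, 0.101),
    "infection": (-0.7000, 0.388),
    "aml": (-0.3806, 0.830),
    "hemoglobin": (0.0501, 9.21),
    "log_wbc": (-0.0597, 4.144),
    "treat_quadratic": (0.0207, -0.407),  # the equation uses (x + .407)
}


def predicted_remission_probability(patient):
    """Return p-hat for a dict holding one value per key in LOGISTIC_TERMS."""
    logit = INTERCEPT
    for name, (coef, centre) in LOGISTIC_TERMS.items():
        logit += coef * (patient[name] - centre)
    # Inverse of the assumed logit link.
    return 1.0 / (1.0 + math.exp(-logit))


# A patient sitting exactly at every centering value has logit = INTERCEPT.
average_patient = {name: centre for name, (_, centre) in LOGISTIC_TERMS.items()}
p_average = predicted_remission_probability(average_patient)
```

Because every variable is centered, the intercept alone determines the prediction for a patient at the average of all seven variables; each coefficient then shifts the log odds per unit departure from that average.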

The coefficients in the equation were determined by stepwise logistic
regression (Lee (1974)) so the significance level of each entering characteristic
can be calculated. The statistical significance level of each entering variable
was: age (P<.01), treatment linear (P<.01), infection status (P<.01), AML (P=.18),
hemoglobin (P=.33), log WBC (P=.76) and treatment quadratic (P=.80). This analysis
demonstrates that there is statistically significant evidence of a linear
increasing trend in response rate by study and that age and infection status are
significantly related to response rate.

Evidence  that  the  five  patient  characteristics  do  predict  complete  remis¬ 
sion  rate  is  given  in  Table  1.  A  logistic  regression  equation  was  fit  to  the  five 
patient  characteristics  in  the  5  and  10-day  OAP  studies  (excluding  treatment  as  a 
possible  characteristic).  This  equation  is  as  follows: 

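$$
\begin{aligned}
\log\frac{\hat{p}}{1-\hat{p}} = {} & .02888 - .04238(\text{Age} - 44.031) \\
& - .59297(\text{Infection status} - .37) - .35854(\text{AML} - .872) \\
& - .01431(\text{Hemoglobin} - 9.155) - .0208(\log(\text{WBC}) - 4.127).
\end{aligned}
$$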

Table 1 gives the observed and predicted numbers of patients responding
on the 10-day OAP and CIAL studies. As would be expected, the relationship
between  observed  and  predicted  probability  of  response  was  excellent  for  the  10- 
day  OAP,  since  the  equation  is  being  re-applied  to  the  same  data  from  which  it 
was  derived.  Note  that  there  is  also  a  good  relationship  between  observed  and 
predicted  probability  of  response  for  patients  on  the  CIAL  study.  The  observed 
percentages  responding  were  higher  than  predicted  in  patients  with  predicted  pro¬ 
babilities  under  .60  and  were  in  accord  with  predictions  for  patients  with  pre¬ 
dicted  probabilities  over  .60.  Hence,  there  is  some  evidence  that  patients  on 
the  CIAL  study  produced  higher  observed  responses  in  patients  with  relatively 
low  predicted  probabilities  of  response.  When  the  equation  was  applied  to  the 
patients  from  5-day  OAP,  the  predicted  complete  remission  rate  was  52.1%;  it 
was  50.0%  for  patients  on  10-day  OAP,  and  50.8%  for  CIAL.  Hence,  there  was 
strong  evidence  that  patients  on  all  three  studies  were  comparable  with  respect 
to the five patient characteristics.
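
The comparison of observed and expected responders can be retraced with a short calculation. The sketch below copies the CIAL columns of Table 1 and reports, for each band of predicted probability, how far the observed count lies above the expected count; the variable and function names are illustrative only.

```python
# CIAL columns of Table 1:
# (predicted-probability band, total patients, observed responders, expected responders)
CIAL_ROWS = [
    ("0-.19", 5, 0, 0.805),
    (".20-.39", 35, 16, 10.075),
    (".40-.59", 27, 17, 13.489),
    (".60-.79", 26, 20, 18.051),
    (".80-1.00", 13, 12, 10.857),
]


def observed_minus_expected(rows):
    """Excess of observed over expected responders in each band."""
    return {band: observed - expected for band, _, observed, expected in rows}


def column_totals(rows):
    """Total patients, observed responders and expected responders."""
    patients = sum(total for _, total, _, _ in rows)
    observed = sum(obs for _, _, obs, _ in rows)
    expected = sum(exp for _, _, _, exp in rows)
    return patients, observed, expected


excess = observed_minus_expected(CIAL_ROWS)
patients, observed, expected = column_totals(CIAL_ROWS)
```

The largest excesses fall in the .20-.39 and .40-.59 bands, which is the pattern described in the text for patients with predicted probabilities under .60.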

Cox's regression model was fit to the survival data from the three
studies  using  the  same  five  patient  characteristics  and  treatment  variables  as 
in  the  analysis  of  response.  Cox’s  model  may  be  written  as  follows: 

$$\lambda(t) = \exp\left[\beta_1(x_1 - \bar{x}_1) + \cdots + \beta_p(x_p - \bar{x}_p)\right]\lambda_0(t)$$

where $\lambda(t)$ is the hazard function at time t, the $\beta$'s are regression coefficients,
the $x$'s are patient characteristics potentially related to survival, the $\bar{x}$'s are
average values, and $\lambda_0(t)$ is an arbitrary hazard function when all the $x$'s are
at  their  mean  values.  The  model  fit  to  the  survival  data  from  the  three  studies 
is  as  follows: 

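$$
\begin{aligned}
\log\frac{\lambda(t)}{\lambda_0(t)} = {} & .0319(\text{Age} - 44.74) - .4269(\text{Treat. linear} - .10) \\
& + .4978(\text{Infection status} - .39) + .1435(\log(\text{WBC}) - 4.14) \\
& - .0429(\text{Treat. quadratic} + .41) - .0097(\text{Hemoglobin} - 9.21) \\
& + .0006(\text{AML} - .83).
\end{aligned}
$$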
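
To illustrate how the fitted coefficients are read, the sketch below evaluates the exponential factor of Cox's model, that is, a patient's hazard relative to a patient at the mean of every characteristic. It assumes the fitted expression is the logarithm of that hazard ratio; `COX_TERMS` and `relative_hazard` are hypothetical names, and the coding of the treatment variables is left to the caller because the text does not state it.

```python
import math

# (coefficient, centering value) pairs copied from the fitted survival model.
COX_TERMS = {
    "age": (0.0319, 44.74),
    "treat_linear": (-0.4269, 0.10),
    "infection": (0.4978, 0.39),
    "log_wbc": (0.1435, 4.14),
    "treat_quadratic": (-0.0429, -0.41),  # the model uses (x + .41)
    "hemoglobin": (-0.0097, 9.21),
    "aml": (0.0006, 0.83),
}


def relative_hazard(patient):
    """Hazard of `patient` divided by the hazard at the mean characteristics."""
    score = sum(coef * (patient[name] - centre)
                for name, (coef, centre) in COX_TERMS.items())
    return math.exp(score)


mean_patient = {name: centre for name, (_, centre) in COX_TERMS.items()}
infected = dict(mean_patient, infection=1)
not_infected = dict(mean_patient, infection=0)
# Ratio of hazards for infected versus non-infected patients, all else equal.
infection_hazard_ratio = relative_hazard(infected) / relative_hazard(not_infected)
```

A positive coefficient raises the hazard and so shortens expected survival, which is why the signs here are opposite to those in the response equation for the same characteristics.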

The  model  was  fit  in  forward  stepwise  fashion  and  the  statistical  sig¬ 
nificance  of  adding  variables  at  each  step  was  as  follows:  age  (P<.01),  treatment 
linear (P=.001), infection status (P=.001), log (WBC) (P=.30), treatment quadratic
(P=.39), hemoglobin value (P=.77) and AML (P=.99). Hence, as in the analysis
of  response,  age  and  infection  status  are  the  two  characteristics  most  signifi¬ 
cantly  related  to  survival  time  and  there  is  evidence  of  a  linear  trend  which 
indicates  increasing  survival  time  by  study.  Figure  1  gives  the  survival  curves 
for patients on the three studies. The median survival time for patients receiving
5-day OAP was 7 weeks, that for patients receiving 10-day OAP was 38 weeks,
and the median has not yet been reached for patients on the CIAL study. There is
evidence of a significant advantage in survival for 10-day vs. 5-day OAP patients
(P<.015) and nearly significant evidence that CIAL has superior survival to 10-day
OAP (P=.059).

These regression analyses have permitted comparison to be made among
treatment  programs,  adjusting  for  patient  characteristics  related  to  prognosis. 
Based  upon  these  analyses,  one  could  more  confidently  assert  that  there  were 
real  differences  in  response  rate  and  survival  among  the  three  studies  because 
patient  characteristics  were  adjusted  for  in  both  analyses,  patients  were  com¬ 
parable  in  the  three  studies  with  respect  to  predicted  probability  of  complete 
remission,  and  the  same  patient  characteristics  (namely,  age  and  infection 
status)  were  significantly  related  to  response  and  survival. 


4.  Discussion 

The  point  of  view  has  been  presented  that  rational,  scientific,  and 
controlled  clinical  studies  can  be  accomplished  without  randomization.  In  some 
circumstances,  patients  that  are  comparable  in  prognosis  can  be  identified  in 
successive  studies  which  allow  comparison  between  a  group  of  patients  under  inves¬ 
tigation  and  other  groups  treated  in  the  past.  Recording  data  which  differs 
significantly  from  that  observed  in  the  past  forms  the  basis  for  new  knowledge. 
Confirmation  of  data  by  the  same  investigator  and  by  other  investigators  in  other 
institutions  provides  a  convincing  mechanism  for  generating  knowledge  which  pre¬ 
dicts  for  the  future. 

The major reasons for preferring the non-randomized to the randomized
study  are:  a  clinical  investigator  in  a  non-randomized  study  is  always  adminis¬ 
tering  what  he  believes  to  be  the  best  treatment  for  the  disease  under  investi¬ 
gation  so  there  is  no  ethical  dilemma,  and  non-randomized  studies  require  fewer 
patients  and  proceed  more  quickly  so  that  new  knowledge  is  gained  faster. 

Randomized  studies  are  useful  if  there  is  no  basis  for  choosing  comparable 
patients  treated  in  the  past  since  patient  characteristics  related  to  prognosis 
are unknown. Also, such studies could be considered when there is no preliminary
evidence that one treatment is substantially better than another so that
the ethical dilemma does not really arise. Thirdly, previous data will sometimes
suggest that the same treatment program be studied according to different dosages
or schedules, etc., and it is convenient to have these treatments in the same
study. Fourthly, when studies are to be conducted over a very long term (say,
3-5 years or more) then patients could be randomized because there is genuine
doubt that the ancillary aspects of the successive studies would be comparable.

In  planning  any  clinical  trial,  there  is  no  substitute  for  imaginative, 
original,  and  creative  thought.  The  best  clinical  trials  are  those  that  have 
the  best  treatments  in  them,  whether  randomized  or  not.  Clinical  knowledge  will 
advance  when  there  has  been  careful  analysis  of  past  results  as  a  basis  for  the 
formulation  of  significant  hypotheses  to  be  tested  in  objective  and  scientifically 
valid  studies. 

Table 1

Observed and Predicted Responses from Logistic Regression Equation
on 10-Day OAP and CIAL Studies

                               CIAL                              10-Day OAP
Predicted      Total   Observed     Expected       Total   Observed     Expected
Probability    No.     No. (PC)     No.            No.     No. (PC)     No.
of Response    Obs.    Responding   Responding     Obs.    Responding   Responding

0 - .19           5     0 (  0)       .805            8     0 (  0)      1.385
.20 - .39        35    16 ( 46)     10.075           44    13 ( 30)     13.749
.40 - .59        27    17 ( 63)     13.489           50    30 ( 60)     24.878
.60 - .79        26    20 ( 77)     18.051           50    37 ( 74)     34.946
.80 - 1.00       13    12 ( 92)     10.857            3     3 (100)      2.530

Total           106    65 ( 61)     53.277          155    83 ( 54)     77.488
                                   (50.761)                            (49.994)



Figure 1. Survival From Treatment Start

              Total   Fail
5-Day OAP        90     67
10-Day OAP      173     88
CIAL            117     38

ABSTRACT. The Army needs information about how well an individual
can perform the tasks necessary for him to do his job. This information
is often gathered by means of a "criterion-referenced test," a test made
up of items directly related to the job of interest. The test results
can  be  used  in  two  ways.  The  first  way  is  to  sort  individuals  into  two 
groups,  one  made  up  of  those  who  can  perform  their  job  satisfactorily 
and  the  other  made  up  of  those  who  do  not  meet  minimal  job  requirements. 

A  second  use  of  the  test  results  is  to  estimate  the  "true"  capability 
of  the  examinees  to  do  the  task  being  tested.  These  two  uses  are  clearly 
related. If one can precisely estimate an individual's capability, then
forming  the  two  groups  is  not  a  problem.  On  the  other  hand,  it  may  be 
possible  to  effectively  form  the  two  groups  without  getting  good  esti¬ 
mates  of  "true"  capability. 

Several  psychometric  models  are  available  for  grouping  the  indi¬ 
viduals  and/or  for  estimating  "true"  scores.  For  example,  one  may 
simply  calculate  the  proportion  of  items  correctly  answered  and  use  that 
proportion  as  an  estimate  of  "true"  capability.  Alternatively,  a  binomial 
error  model  for  deriving  the  expression  for  the  regression  of  "true"  score 
on  observed  score  can  be  used  and  a  "true"  score  calculated  for  each 
individual.  Other  possible  models  include  a  Bayesian  Model  II  approach 
and  a  latent  trait  model  such  as  the  Rasch  one  parameter  logistic  model. 
Each  of  these  models  yields  a  somewhat  different  estimate  of  "true" 
capability  for  any  given  individual.  It  follows  that  the  makeup  of  the 
job  ability  groups  will  vary  from  model  to  model.  The  purpose  of  this 
research  is  to  empirically  study  the  models  referred  to  above.  What 
is  needed  is  an  appropriate  statistic  (or  statistics)  and  research 
design  for  comparing  each  model  against  all  others  given  the  same  test 
data. 

1. INTRODUCTION. The purpose of this paper is to elaborate on
some  technical  details  and  to  highlight  specific  statistical  and 
research  problems  introduced  in  a  previous  paper  by  one  of  the  authors 
(Epstein,  1975). 

Epstein  described  four  procedures  for  estimating  true  scores  from 
observed  scores.  The  first  uses  the  observed  proportion  correct  as  an 
estimate of the true proportion correct. This procedure is straightforward
and familiar. Hence, discussion of it will be reserved until
the problem of comparing the models is developed. The other three procedures
are 1) a binomial error model, 2) a Bayesian model, and 3) the
Rasch  logistic  model.  Each  will  be  discussed  in  detail. 

2.  BINOMIAL  ERROR  MODEL.  The  binomial  error  model  (Lord  and 
Novick,  1968,  pp.  508-529)  is  based  on  the  assumption  that  the  condi¬ 
tional  distribution  of  observed  score  for  given  proportion  correct  true 
score  (T)  is  the  binomial  distribution, 

$$b(x \mid T) = \binom{n}{x} T^x (1-T)^{n-x},$$

where x = 0, 1, ..., n is the number of correct responses observed and n is the total
number of items on the test.

It is assumed that items are scored dichotomously, that total score
for an examinee is the number of items answered correctly, that items
are locally independent, and that items are equally difficult for a
given examinee.

The  relationship  between  the  observed  score  distribution  and  the 
underlying true score distribution can be written as follows:

$$\phi(x) = \binom{n}{x}\int_0^1 g(T)\,T^x (1-T)^{n-x}\,dT, \quad x = 0, 1, \ldots, n,$$

where $\phi(x)$ is the distribution of observed scores and $g(T)$ is the unknown
distribution of true scores.

It  can  be  shown  that  if  the  regression  of  true  score  on  observed 
score  is  linear  then  the  distribution  of  observed  score,  symbolized  h(x) 
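to distinguish this special case from the general case $\phi(x)$, is
negative hypergeometric:

$$h(x) = \frac{b^{[n]}}{(a+b)^{[n]}} \cdot \frac{(-n)_x\,(a)_x}{x!\,(-b)_x}, \quad x = 0, 1, \ldots, n,$$

where a and b are parameters to be determined and

$$n^{[x]} = n(n-1)\cdots(n-x+1), \quad (a)_x = a(a+1)\cdots(a+x-1), \quad n^{[0]} = (a)_0 = 1.$$

The parameters, a and b, can be expressed in terms of moments of the
observed score distribution (its mean $\mu_x$ and variance $\sigma_x^2$):

$$a = \left(-1 + \frac{1}{\alpha_{21}}\right)\mu_x, \qquad b = -a - 1 + \frac{n}{\alpha_{21}},$$

$$\alpha_{21} = \frac{n}{n-1}\left[1 - \frac{\mu_x(n-\mu_x)}{n\,\sigma_x^2}\right].$$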

The discussion thus far has outlined an internal check of the
appropriateness  of  this  model  for  any  given  data  set.  That  is,  if 
one  can  show  adequate  fit  to  the  negative  hypergeometric  distribution 
by  the  observed  scores  then  it  is  reasonable  to  continue  with  this 
model  assuming  linear  regression.  If  adequate  fit  is  not  obtained 
then  either  the  more  general  nonlinear  regression  approach  must  be  used 
or  alternative  models  must  be  identified. 

It  can  be  shown  that  if  the  observed  score  distribution  is  negative 
hypergeometric,  the  true  score  distribution  is  either  the  two  parameter 
beta  distribution,  or  some  other  distribution  having  identical  moments 
up  through  order  n.  In  either  case,  the  regression  of  true  score  on 
observed  score  is  given  by  the  linear  equation 

$$E(T \mid x) = \alpha_{21}\,\frac{x}{n} + (1-\alpha_{21})\,\frac{\mu_x}{n}, \quad x = 0, 1, \ldots, n.$$
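
The binomial error model calculations can be sketched in a few lines of Python. The code below is a minimal illustration, assuming that $\alpha_{21}$ is the Kuder-Richardson formula 21 reliability computed from the observed mean and the population (divide-by-m) variance; the function names and the five example scores are hypothetical, and the scores must not all be equal.

```python
from statistics import mean, pvariance


def kr21(scores, n_items):
    """alpha_21 from the mean and variance of observed number-correct scores."""
    mu = mean(scores)
    var = pvariance(scores)  # assumption: population variance of the group
    return (n_items / (n_items - 1)) * (1 - mu * (n_items - mu) / (n_items * var))


def negative_hypergeometric_parameters(scores, n_items):
    """Parameters a and b of h(x), expressed through mu_x and alpha_21."""
    alpha = kr21(scores, n_items)
    a = (-1 + 1 / alpha) * mean(scores)
    b = -a - 1 + n_items / alpha
    return a, b


def regressed_true_proportion(x, scores, n_items):
    """E(T | x): observed proportion pulled toward the group mean proportion."""
    alpha = kr21(scores, n_items)
    return alpha * x / n_items + (1 - alpha) * mean(scores) / n_items


example_scores = [2, 4, 5, 6, 8]   # hypothetical scores on a 10-item test
alpha_21 = kr21(example_scores, 10)
a, b = negative_hypergeometric_parameters(example_scores, 10)
estimate_for_8 = regressed_true_proportion(8, example_scores, 10)
```

The lower the reliability, the more an individual's estimate is pulled toward the group mean; an examinee scoring exactly at the mean keeps the observed proportion as the estimate.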

3. BAYESIAN MODEL. The Bayesian model used to evaluate these data
is  described  by  Lewis,  Wang,  and  Novick  (1973).  The  procedure  transforms 
the  binomial  test  score  data  via  an  arc  sine  transformation.  The  re¬ 
sulting  score  is  assumed  to  be  a  sample  from  a  normal  population  with  its 
mean  value  at  the  individual’s  transformed  true  ability.  Distributions 
for  the  prior  mean  and  variance  of  the  examinee  group’s  transformed 
scores  are  specified  and  posterior  values  calculated.  Finally,  the 
posterior  marginal  distributions  for  the  transformed  scores  are  obtained 
and  estimates  of  individual  true  abilities  on  the  original  (proportion 
correct)  scale  are  calculated.  The  mathematical  details  are  outlined 
below. 

The Freeman-Tukey transformation for binomial data is used in
this procedure:

$$g_j = \frac{1}{2}\left[\sin^{-1}\sqrt{\frac{x_j}{n+1}} + \sin^{-1}\sqrt{\frac{x_j+1}{n+1}}\right],$$

where $x_j$ is examinee j's number of correct responses. The $g_j$ are assumed to be
normally distributed with mean $\gamma_j = \sin^{-1}\sqrt{\pi_j}$ and variance $v = (4n+2)^{-1}$, where $\gamma_j$
is the transformed value of the true proportion of correct responses, $\pi_j$.
The validity of the assumption of normality and the suitability of the
transformation for the procedures to follow can be shown to be adequate
for examinee groups of at least 15 persons and for tests at least 8 items
long.

The set of transformed variables, $\gamma_j$, is assumed to be a random
sample from a normal distribution with mean $\mu_\Gamma$ and variance $\phi_\Gamma$. $\mu_\Gamma$ and
$\phi_\Gamma$ are further assumed to be independent and to have a uniform and inverse
chi-square distribution respectively. Explicit expressions for the prior
and posterior density functions are given in the Lewis, et al. paper.


The  desired  result  of  an  analysis  of  this  kind  is  the  marginal 
posterior density function for $\gamma_j$. Unfortunately, an explicit
expression for it is not obtainable from the joint posterior probability
density function of the $\gamma_j$ vector given the $g_j$ vector. Lewis et al.
show methods for obtaining the marginal means and variances for the
$\gamma_j$ using numerical integration. However, they indicate that for
large sample sizes, the conditional posterior distribution of $\gamma_j$ given
$\tilde{\phi}_\Gamma$ and the $g$ vector provides an acceptable approximation. The
conditional approximation was used for the analysis of the data reported
in the Epstein paper.

The conditional distribution of $\gamma_j$ given $\tilde{\phi}_\Gamma$ and the $g$ vector can
be shown to be normal with mean

$$E(\gamma_j \mid \tilde{\phi}_\Gamma, g) = \frac{\tilde{\phi}_\Gamma\, g_j + v\, g_\cdot}{\tilde{\phi}_\Gamma + v}$$

and variance

$$\operatorname{var}(\gamma_j \mid \tilde{\phi}_\Gamma, g) = \frac{v\,(\tilde{\phi}_\Gamma + m^{-1} v)}{\tilde{\phi}_\Gamma + v},$$

where

j = 1, 2, ..., m; m = the number of examinees,

$g$ = the vector of transformed scores, with mean $g_\cdot$, and

$\tilde{\phi}_\Gamma$ = the mode of $\phi_\Gamma$, given $g$.

$\tilde{\phi}_\Gamma$ can be obtained by solving the following equation:

$$(m + \nu + 1)\,\tilde{\phi}_\Gamma^3 + \Big[(m + 2\nu + 3)\,v - \sum_j (g_j - g_\cdot)^2 - \lambda\Big]\tilde{\phi}_\Gamma^2 + \big[(\nu + 2)\,v^2 - 2\lambda v\big]\tilde{\phi}_\Gamma - \lambda v^2 = 0.$$

In the above equation, $\nu$ is the degrees of freedom for the prior
inverse chi-square distribution of $\phi_\Gamma$. Lewis, et al. recommend that
a value of eight be used for most practical applications. $\lambda$ is the
scale factor for the inverse chi-square distribution. It can be
calculated by using the formula

$$\lambda = \frac{\nu - 2}{4(t+1)},$$

where t is interpreted as the number of test items that the prior
information is considered to be equivalent to.

Once the $\gamma_j$ have been calculated, the last step in the procedure
is to calculate the estimates for the true proportion correct. This
is accomplished by applying the following equation:

$$\hat{\pi}_j = \left(1 + \frac{1}{2n}\right)\sin^2\gamma_j - \frac{1}{4n}.$$
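
The three steps of the Bayesian procedure (transform, shrink toward the group mean, transform back) can be sketched as follows. This is an illustration under stated assumptions rather than the Lewis, Wang, and Novick program: it uses the usual averaged double-arcsine form of the Freeman-Tukey transformation, and it takes the modal value of the prior variance (called `phi` here) as a given input instead of solving for it. All function names are hypothetical.

```python
import math


def freeman_tukey(x, n):
    """Averaged arcsine transform of x correct answers out of n items (assumed form)."""
    return 0.5 * (math.asin(math.sqrt(x / (n + 1)))
                  + math.asin(math.sqrt((x + 1) / (n + 1))))


def conditional_posterior_means(g, phi, n):
    """Shrink each transformed score toward the mean of the transformed scores."""
    v = 1.0 / (4 * n + 2)            # sampling variance of each g_j
    g_bar = sum(g) / len(g)
    return [(phi * gj + v * g_bar) / (phi + v) for gj in g]


def true_proportion(gamma, n):
    """Back-transform to the proportion-correct scale."""
    return (1 + 1 / (2 * n)) * math.sin(gamma) ** 2 - 1 / (4 * n)


n_items = 10
raw_scores = [3, 5, 5, 8, 9]          # hypothetical examinee scores
g = [freeman_tukey(x, n_items) for x in raw_scores]
gammas = conditional_posterior_means(g, phi=0.05, n=n_items)  # phi is an assumed value
estimates = [true_proportion(gm, n_items) for gm in gammas]
```

A small `phi` relative to `v` pulls every examinee strongly toward the group mean, while a large `phi` leaves the transformed scores nearly unchanged.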

4. RASCH MODEL. The Rasch one parameter logistic model (Wright and
Panchapakesan, 1969) assumes that the observed response $a_{ni}$ of person
n to item i is governed by a binomial probability function of person
ability $Z_n$ and item easiness $E_i$. The probability of a correct response is

$$P(a_{ni} = 1) = \frac{Z_n E_i}{1 + Z_n E_i}.$$

The probability of a wrong response is:

$$P(a_{ni} = 0) = 1 - P(a_{ni} = 1) = \frac{1}{1 + Z_n E_i}.$$

These equations may be combined to yield

$$P(a_{ni}) = \frac{(Z_n E_i)^{a_{ni}}}{1 + Z_n E_i}.$$

If we let $b_n = \log Z_n$ and $d_i = \log E_i$,

$$P(a_{ni}) = \frac{\exp(a_{ni}(b_n + d_i))}{1 + \exp(b_n + d_i)}.$$

The number of correct responses to a given set of items is the only
information needed to estimate person ability. All persons who get the
same score will be estimated to have the same ability. Hence, in terms
of score groups,

$$P(a_{ni}) = \frac{\exp(a_{ni}(b_j + d_i))}{1 + \exp(b_j + d_i)},$$

where j = score of person n, and all persons with a score j are estimated
to have the same probability governing their responses to item i.

The equations obtained when the condition of a maximum likelihood
is satisfied for the model described in the preceding equation are:

$$a_{+i} = \sum_{j=1}^{k-1} \frac{r_j \exp(b_j^* + d_i^*)}{1 + \exp(b_j^* + d_i^*)}, \quad i = 1, 2, \ldots, k,$$

$$j = \sum_{i=1}^{k} \frac{\exp(b_j^* + d_i^*)}{1 + \exp(b_j^* + d_i^*)}, \quad j = 1, 2, \ldots, k-1,$$

where

$a_{+i}$ = number of persons who get item i correct,

j = the total test score (an ability estimate is obtained for each score),

$r_j$ = number of persons in score group j,

$b_j^*, d_i^*$ = estimates of $b_j$ and $d_i$.

The method consists of computing $d_i$ and $b_j$ from the implicit equations
above. The equations are handled as two independent sets and solved
accordingly.
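
One of the two sets of implicit equations, the one that ties each score j to an ability estimate, can be solved on its own once the item easiness values are treated as known. The sketch below does this by bisection, which is a substitution made here for simplicity and not necessarily the procedure of Wright and Panchapakesan; the function names, the search bounds, and the example easiness values are assumptions.

```python
import math


def p_correct(b, d):
    """Rasch probability of a correct response for ability b and easiness d."""
    return 1.0 / (1.0 + math.exp(-(b + d)))


def expected_score(b, easiness):
    """Right-hand side of the score equation: sum of item probabilities."""
    return sum(p_correct(b, d) for d in easiness)


def ability_for_score(j, easiness, lo=-20.0, hi=20.0, tol=1e-10):
    """Find b with expected_score(b) = j; defined only for 0 < j < number of items."""
    if not 0 < j < len(easiness):
        raise ValueError("scores of zero or all items correct have no finite estimate")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        # expected_score is increasing in b, so move the bracket toward the root.
        if expected_score(mid, easiness) < j:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


equal_items = [0.0, 0.0, 0.0, 0.0]      # hypothetical calibrated easiness values
b_for_2 = ability_for_score(2, equal_items)
b_for_3 = ability_for_score(3, equal_items)
```

The range j = 1, ..., k-1 in the equations reflects the same restriction enforced in the code: a zero or perfect score gives no finite ability estimate.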

An  approximation  of  a  standard  error  for  item  estimates  can  be 
obtained  by  assuming  that  the  variance  of  the  item  estimate  is  due 
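primarily to the uncertainty in the item score $a_{+i}$. To a first
approximation this gives:

$$V(d_i^*) = \left(\partial d_i / \partial a_{+i}\right)^2 V(a_{+i}),$$

which leads to:

$$V(d_i^*) = 1 \Big/ \sum_j \left( r_j \exp(b_j^* + d_i^*) / (1 + \exp(b_j^* + d_i^*))^2 \right).$$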


The  major  contribution  to  the  error  variance  of  the  ability 
estimate  comes  from  the  variance  in  scores  produced  by  a  given  indi¬ 
vidual.  This  part  of  the  error  variance  depends  upon  the  number  of 
items  and  their  easiness  range. 

An approximation of the variance of the ability estimate b* is
given by

$$V^*(b^*) = \{1 / (C(b^*)\exp(b^*))\} + \{1 / C^2(b^*)\} \cdot \sum_i \left( V(d_i)\,\{\exp(d_i) / (1 + \exp(d_i + b^*))^2\}^2 \right),$$

where

$$C(b^*) = \sum_i \left( \exp(d_i) / (1 + \exp(b^* + d_i))^2 \right)$$

and $V(d_i)$ is the variance of the item calibration $d_i$.

The first term in the V*(b*) equation is due to the
variance in the score, and the second term is due to the imprecision
of item calibration. The first term is always larger than the second.

5.  DISCUSSION  OF  THE  PROBLEM.  One  characteristic  of  a  useful  model  is 
that  it  has  a  small  error  of  measurement.  That  is,  the  distribution  of 
estimated  scores  for  a  given  true  score  is  closely  clustered  around  the 
true  score.  The  extent  of  the  measurement  error  that  can  be  expected 
with  a  given  model  is  dependent  on  the  variance  of  the  estimated  true 
score.  For  example,  in  the  proportion  correct  model,  the  variance  of 
the  estimated  true  proportion  correct  is  equal  to  p(l-p)/n.  In  this 
case  the  variance  of  the  estimate  will  decrease  as  the  number  of  obser¬ 
vations  increases.  Thus  it  would  seem  that  any  level  of  precision  could 
be obtained by simply adding observations. Unfortunately, for the number
of  items  that  are  usually  practical  on  a  test,  the  level  of  precision 
possible  is  not  completely  satisfactory.  It  would  be  useful  to  compare 
the  variance  of  the  true  score  estimates  obtained  with  the  other  models 
to  the  proportion  correct  model. 


Therefore the question of how to derive an expression for the
variance of the estimated true scores for the other models must be
addressed. An expression for the binomial error model has been derived.
Since the binomial error model results in a regression equation it seems
reasonable to base the derivation on the general form of the error of
estimation,

$$\sigma_\varepsilon = \sigma_T \sqrt{1 - \rho_{xT}^2}.$$

The ratio of the variance of true scores to the variance of observed scores
equals the reliability coefficient,

$$\alpha_{21} = \frac{\sigma_C^2}{\sigma_x^2},$$

where $\sigma_C^2$ is the variance of the true number correct. Since the true
number correct equals the true proportion correct times the number of items,
$C = nT$, one may write $\sigma_C^2 = n^2 \sigma_T^2$. Substituting,
$\sigma_T^2 = \alpha_{21}\sigma_x^2 / n^2$. The reliability of a test equals
the square of the correlation between true and observed scores,
$\alpha_{21} = \rho_{xT}^2$. Hence, the variance of the estimated true score
can be written

$$\sigma_\varepsilon^2 = \frac{\sigma_x^2\,\alpha_{21}\,(1 - \alpha_{21})}{n^2}.$$

For  the  Bayesian  and  Rasch  models  expressions  for  the  variances 
of  the  estimated  true  scores  were  not  derived.  In  the  case  of  the 
Bayesian  model  the  output  is  in  terms  of  the  arc  sine  of  the  true  pro¬ 
portion  correct.  While  the  sampling  distribution  of  the  transformed 
variable  is  known,  the  variance  of  the  estimated  true  proportion  correct 
itself was not determined. A similar problem exists for the Rasch model.
The sampling distributions of the ability and item difficulty indices
are known as well as the explicit equation for calculating the proportion
correct from those values. But an expression for the variance of the
estimated true proportion correct has not been derived. In short, the problems are:

(1) For the Bayesian model, given the variance of $\gamma_j$ and the equation
$\pi_j = (1 + 1/2n)\sin^2\gamma_j - 1/4n$, what is the variance of $\pi_j$; and

(2) For the Rasch model, given the variances of b* and d* and the equation
$p(\text{correct}) = \exp(b^* + d^*) / (1 + \exp(b^* + d^*))$, what is the variance of p?

As  a  result  of  the  discussion  during  the  session  a  solution  to  the 
above  mathematical  problems  seems  to  be  available.  It  was  pointed  out 
that  methods  exist  for  deriving  standard  errors  of  functions  of  random 
variables.  One  promising  approach  outlined  in  Kendall  and  Stuart  (1969, 
p.  231)  involves  evaluating  terms  of  a  Taylor  expansion.  Using  the 
Kendall  and  Stuart  procedure  it  should  be  possible  to  derive  expressions 
for  the  standard  error  of  measurement  for  each  of  the  models.  This  will 
allow  for  formal  comparison  of  the  models  without  real  or  simulated  data. 
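
A first-order Taylor expansion (often called the delta method) gives approximate answers to both questions: the variance of a smooth function of an estimate is roughly the squared derivative times the variance of the estimate. The sketch below applies that idea to the two equations quoted in the problem statements. It is an illustration of the general technique, not the Kendall and Stuart derivation itself, and the Rasch version additionally assumes that the errors in b* and d* are uncorrelated. Function names and example values are hypothetical.

```python
import math


def var_proportion_bayes(gamma, var_gamma, n):
    """Approximate variance of pi = (1 + 1/2n) sin^2(gamma) - 1/4n."""
    # d(pi)/d(gamma) = (1 + 1/2n) * 2 sin(gamma) cos(gamma) = (1 + 1/2n) sin(2 gamma)
    slope = (1 + 1 / (2 * n)) * math.sin(2 * gamma)
    return slope ** 2 * var_gamma


def var_proportion_rasch(b, d, var_b, var_d):
    """Approximate variance of p = exp(b + d) / (1 + exp(b + d))."""
    p = 1.0 / (1.0 + math.exp(-(b + d)))
    # dp/db = dp/dd = p(1 - p); the two error sources are assumed uncorrelated.
    return (p * (1 - p)) ** 2 * (var_b + var_d)


# Hypothetical inputs: a 10-item test, an examinee at the middle of the scale.
bayes_var = var_proportion_bayes(math.pi / 4, 1.0 / 42.0, 10)
rasch_var = var_proportion_rasch(0.0, 0.0, 0.16, 0.04)
```

For comparison, the proportion correct model gives p(1-p)/n = .025 for p = .5 and n = 10, so variances on the same scale can be set side by side once the inputs for each model are available.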

The  discussion  then  considered  whether  it  was  possible  to  compare 
the models by obtaining an estimate of "true score" and comparing it to
the "real" true score. The problem lies in obtaining an acceptable "real" true
score. Three approaches were considered and are expected to provide
a basis for future research. The first is to base model comparisons
on Monte Carlo simulation studies. Monte Carlo studies provide
an  unambiguous  true  score  but  suffer  from  their  lack  of  generalizability 
to  practical  applications.  A  second  approach  is  to  define  true  score 
as  the  score  obtained  on  an  instrument  consisting  of  a  large  number  of 
items.  The  models  would  then  be  used  to  estimate  the  true  score  using 
a  smaller  and  more  realistic  number  of  items.  This  approach  is  em¬ 
pirical  and  more  directly  oriented  to  practical  applications  where 
testing  time  and  the  number  of  items  that  may  be  included  in  an  instru¬ 
ment  are  limited.  Although  this  approach  suffers  from  the  fact  that 
the  defined  true  score  is  not  error  free,  the  amount  of  error  is  not 
likely  to  be  significant  for  practical  purposes.  The  third  approach 
would  investigate  the  possibility  of  applying  Geisser’s  predictive 
sample  reuse  method  (Geisser,  1975)  to  the  comparison  of  the  models. 
Geisser’s  method  may  provide  a  more  formal  empirical  approach  to 
model comparison than the second approach discussed above; however,
it has not been determined whether or not it is applicable to this
research. 


Four  models  for  estimating  true  scores  were  presented  and 
methods  for  comparing  their  outputs  were  discussed.  Procedures  for 
comparing  the  statistical  properties  of  the  models  are  available  and 
relatively  straightforward.  Future  research  will  be  concerned  with 
establishing  the  empirical  validity  of  the  models  and  their  applica¬ 
bility  to  solving  practical  measurement  problems. 

NON-RANDOMIZED FACTORIAL DESIGNS CHARACTERIZED BY TREND
ELIMINATION AND A MINIMUM NUMBER OF FACTOR LEVEL CHANGES

Les Lancaster and Steve Reynolds
U.S. Army Operational Test and Evaluation Agency
Falls Church, Virginia

ABSTRACT. An admissible set of run orders is developed for $2^p$
factorial  designs  restricted  to  trend  elimination.  The  best  design  is  then 
selected  from  this  admissible  set  having  the  minimum  number  of  factor  level 
changes.  The  procedure  is  developed  for  p=5  where  admissible  sets  are  gen¬ 
erated  between  various  mixtures  of  linear,  quadratic,  and  cubic  trend 
elimination  and  main  effects,  first  order  interactions,  and  second  order 
interactions.  The  number  of  factor  level  changes  is  used  to  generate  the 
admissible  set. 

1.  INTRODUCTION.  The  design  of  two-level  factorial  experiments  robust 
against  time  trends  will  be  illustrated  in  this  paper.  In  fact  designs  with 
zero  time  trends  will  be  displayed  that  also  keep  the  number  of  factor  level 
changes from run to run small. Both of these features are essential in
operational  testing  due  to  resource  problems.  Operational  cost  effectiveness 
is  achieved  by  minimizing  the  number  of  factor  level  changes.  Soldier  learn¬ 
ing  and  selection  is  controlled  by  an  elimination  of  time  trends  in  the 
experimental  designs.  Thus,  these  designs  are  characterized  by  specifying 
the  run  orders  prior  to  running  the  tests.  A  combinatorial  technique  is 
developed  for  generating  these  desirable  designs. 

In  the  planning  of  an  experiment  costs  can  be  reduced  by  a  multi-phase 
design.  The  first  phase  would  be  the  design  of  all  controllable  factors  at 
their  low  and  high  levels.  Additional  phases  would  be  adaptive.  That  is, 
the  results  of  the  first  phase  would  be  decisive  for  determining  the  design 
for the additional phases. This forces the complex overall design to be
developed in real time. However, the possible options at each
phase are planned and designed a priori and the results of the previous
phase trigger the design decisions for the next phase. This report will be
concerned with the first phase where p factors are varied, each at two levels.

A method for the selection of run orders spaced at equal time intervals
is developed whereby a subset of possible or admissible run order choices
is restricted to trend elimination. The designer then has the option to
randomize on this admissible set or else he can select the run order with a
minimum number of factor level changes. With respect to trend elimination,
Figure 1 summarizes seven admissible subsets which will be studied in Chapter 6.
However, cases two and three admit empty sets and are included for academic
purposes.

FIGURE 1. Cases to be Considered

                  Highest Restriction On
Case      Main       1st Order       2nd Order
Number    Effects    Interactions    Interactions
  1         L            L               L
  2         Q            Q               -
  3         C            L               -
  4         C            -               -
  5         Q            L               -
  6         L            L               -
  7         Q            -               -

In Figure 1 the following notation is used:

L = Linear

Q = Quadratic and linear

C = Cubic, quadratic, and linear

The different cases can be expressed in vector notation by writing each
case as (i, j, k). For example, case 5 can be expressed as (Q, L, -).
Utilizing this notation, the coordinate denotes where the restriction is to
be placed and the coordinate value denotes the type of restriction. This will
become clearer in Chapter 6.

The  options  left  to  the  test  designer  for  each  of  the  cases  are  very 
flexible.  In  certain  situations  the  choice  for  a  run  order  may  be  dictated 
by  other  criteria  such  as  engineering  judgement  with  respect  to  some 
of  the  factor  interactions.  For  example,  some  of  the  factor  interactions 
or  treatment  combinations  may  be  null  or  of  no  importance  to  the  experimenter. 
For  these  situations  the  chosen  run  order  can  have  a  smaller  number  of  factor 
level  changes  as  a  tradeoff  for  a  higher  time  trend  for  the  null  treatment 
combinations. 

The developed method is an alternative to full randomization. Some
experimenters often use blocks to gain sensitivity at the expense of full
randomization by reducing time trends to an average variation within blocks.
However, if the blocks contain many runs, then the average trend within a
block may still cause a disturbing effect. In the developed method
randomization is restricted to the admissible set of runs, whereby a price tag can
even be attached to each ordered sequence of runs in the admissible set.
Selection is then based on the set with the total number of factor level
changes minimized. Procedures for partial randomization with respect to
equivalence classes are left as an option to the designer.

2. REVIEW OF PERTINENT LITERATURE. In this paper admissible sets are
restricted to zero time trends, and the optimal run order is the one which
has a minimum number of factor level changes. Other work has restricted
attention to admissible sets having the minimum number of factor level changes,
where the optimal run order is the one which has a minimum (non-zero) simple
or multiple correlation with time. In this paper the admissible sets have zero
simple and multiple correlations with time. Thus far in the literature,
including this paper, only two-level factors have been studied.

Addelman (1) briefly summarizes the state-of-the-art up to March 1972.

Daniel and Wilcoxon (2) analyze full fractional factorial designs with
respect to linear and quadratic time trends. Their approach is extended
in this paper. They do not consider factor level changes in their run
orders.

Draper and Stoneman (4) were the first to consider the tradeoff between
factor level changes and linear time trends. However, they look mostly
at the combinatorials and it appears that they use search techniques to
display their run orders.

Tiahrt and Weeks (6) consider the selection of run orders with respect
to factor level changes plus randomization on equivalence classes.

Dickinson  (3)  restricts  to  the  minimum  number  of  factor  level  changes 
and  then  selects  his  run  orders  having  minimum  simple  and  multiple 
correlations  with  linear  time  trends.  He  uses  a  computer  search  technique 
to  find  a  few  of  the  many  possible  run  orders. 

Thomas  (5)  considers  run  orders  with  the  minimum  number  of  factor  level 
changes  and  applies  the  procedure  to  sensitivity  analysis  of  parameters  in 
large  scale  deterministic  computer  models. 

3. METHOD OF DESIGN SELECTION. The method will be illustrated by
application to a 2^5 factorial design with N = 32 runs, that is, a full
factorial design. The extension to designs with p > 5 will be obvious
from the illustration.

A 2^p factorial design is characterized by N = 2^p runs of p factors, with
each factor at two levels. For p = 5, Figure 2 displays the design matrix
of ±1's (1's are omitted for ease of typing) in standard Yates notation
for the 32 runs and the 32 treatment combinations, where "T" denotes the
total treatment combination, which is omitted in the selection criterion.


The Yates algorithm will be used for computing polynomial trend of
factors at two levels. Daniel and Wilcoxon (reference 2) have applied the
Yates algorithm to the integer linear and quadratic Tchebycheff
orthogonal polynomials given in Figure 3. The Yates solution is equivalent
to performing the matrix product between the design matrix (plus and minus
ones as given by Figure 2) and the polynomial vector. The Yates solution
is much faster than the matrix product. The Daniel-Wilcoxon procedure is
applied here where we extend up to the (p-2)th order of the polynomial.
Further, the method developed in this paper will take into account the
number of factor level changes. In fact, it turns out that the number
of factor level changes for each factor characterizes and complements the
standard Yates design.
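
The stated equivalence between the Yates solution and the matrix product can be checked numerically. The following is a minimal Python sketch, not the program used for this paper. The function names are hypothetical, and the closed-form integer polynomials (built from u = 2t - 33 for t = 1, ..., 32) are an assumption of the sketch, chosen so that the end values agree in magnitude with the 31, 155, and 899 quoted for Figure 3; the signs may differ from those of Figure 3.

```python
def yates(y):
    """Yates algorithm: p passes of pairwise sums followed by pairwise differences."""
    n = len(y)
    p = n.bit_length() - 1
    if 1 << p != n:
        raise ValueError("length must be a power of two")
    col = list(y)
    for _ in range(p):
        sums = [col[i] + col[i + 1] for i in range(0, n, 2)]
        diffs = [col[i + 1] - col[i] for i in range(0, n, 2)]
        col = sums + diffs
    return col  # entry m is the contrast for treatment bit pattern m (0 is T)


def sign(mask, run):
    """Design matrix entry. `mask` and `run` are 0-based bit patterns with
    bit 0 for factor A, bit 1 for B, and so on; a set bit in `run` means high."""
    lows = bin(mask & ~run).count("1")  # factors of the treatment at the low level
    return -1 if lows % 2 else 1


def matrix_product(y):
    """Direct product of the +/-1 design matrix with the vector y."""
    n = len(y)
    return [sum(sign(mask, run) * y[run] for run in range(n)) for mask in range(n)]


N = 32
u = [2 * t - 33 for t in range(1, N + 1)]        # -31, -29, ..., 31
linear = u
quadratic = [(v * v - 341) // 4 for v in u]      # 155, 125, ..., 155
cubic = [(v ** 3 - 613 * v) // 12 for v in u]    # -899, ..., 899

for poly in (linear, quadratic, cubic):
    assert yates(poly) == matrix_product(poly)
```

The Yates passes need on the order of N log N additions, against N squared multiplications for the matrix product, which is the speed advantage referred to above.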


In Figure 3 only the first 16 numbers are arrayed. The second set
of 16 numbers is found by reflecting each column downward and reversing
the sign for the linear and cubic column. For example, the 32nd number
for each column will be -31, 155, and 899.

For p = 5 Figure 4 gives the Yates solution performed on the Tchebycheff
orthogonal polynomials (Figure 3) up to the third order. In Figure 4
the ordering of the treatment combinations has been changed from the
standard Yates ordering to a more convenient ordering for the method to be
developed in this paper. It turns out that this new ordering groups the
various types of treatments with either sets of zeros or sets of non-zeros.

FIGURE 2. Standard Yates Notation for the Design Matrix for 32 Runs

[Design matrix of plus and minus signs: rows are runs 1 to 32; columns are the 32 treatment combinations T, A, B, AB, C, AC, BC, ABC, D, ..., ABCDE in standard Yates order.]


The factor level changes are also given in Figure 4. Note that the
number of factor level changes varies from 1 to 31. The main effect for A
has the maximum number of factor level changes. For determining the number
of factor level changes for any design only the level changes for the main
effects are summed. Therefore, the standard Yates design is characterized
by 57 factor level changes. Thus, as the references show, the standard Yates
design is undesirable with respect to factor level changes. Also, the
standard Yates design has large correlations with time, again an undesirable
characteristic. Thus, optimal designs will be found in this paper having
admissible properties.
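
The count of 57 can be reproduced by counting sign changes down each main effect column of the standard Yates design. The sketch below is illustrative only; the helper names and the bit-pattern encoding of treatments (bit 0 for A through bit 4 for E) are assumptions of the sketch.

```python
N_RUNS = 32
FACTORS = "ABCDE"


def sign(mask, run):
    """+1 or -1 entry of the standard Yates design matrix (a set bit means high)."""
    lows = bin(mask & ~run).count("1")
    return -1 if lows % 2 else 1


def level_changes(mask, n_runs=N_RUNS):
    """Number of run-to-run sign changes in the column for treatment `mask`."""
    col = [sign(mask, run) for run in range(n_runs)]
    return sum(1 for a, b in zip(col, col[1:]) if a != b)


# Level changes for the five main effects and their total.
changes = {name: level_changes(1 << k) for k, name in enumerate(FACTORS)}
total = sum(changes.values())

# Level changes for all 31 treatments (T excluded).
all_counts = sorted(level_changes(mask) for mask in range(1, N_RUNS))
```

Under these assumptions each of the 31 treatments receives a different count, which is what allows the counts to serve as treatment names in the algebra of the next chapter.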

The  time  counts  for  each  treatment  are  the  same  as  the  Yates  solution 
given  in  Figure  4.  Note  that  for  the  standard  Yates  design  the  main  effects 
have  zero  quadratic  time  trend.  The  first  order  treatment  interactions 
have  zero  linear  and  zero  cubic  time  trend.  The  second  order  treatment 
interactions  have  non-zero  cubic  time  trend.  The  third  order  treatment 
interactions  have  all  zero  time  trend.  These  observations  are  utilized  to 
construct  admissible  run  orders  for  the  cases  given  in  Figure  1. 
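
These observations can be verified directly from the time counts. The following sketch recomputes the counts by the matrix product and attaches to every treatment the strongest restriction it satisfies, using the L, Q, C notation of Figure 1. The integer polynomials, the "-" label for treatments with non-zero linear trend, and all function names are assumptions of the sketch.

```python
N = 32
u = [2 * t - 33 for t in range(1, N + 1)]
POLYS = {
    "linear": u,
    "quadratic": [(v * v - 341) // 4 for v in u],
    "cubic": [(v ** 3 - 613 * v) // 12 for v in u],
}


def sign(mask, run):
    """+1 or -1 entry of the standard Yates design matrix (a set bit means high)."""
    lows = bin(mask & ~run).count("1")
    return -1 if lows % 2 else 1


def time_counts(mask):
    """Linear, quadratic, and cubic time counts for one treatment column."""
    return {name: sum(sign(mask, run) * poly[run] for run in range(N))
            for name, poly in POLYS.items()}


def restriction(mask):
    """Strongest restriction met: 'C', 'Q', 'L', or '-' when the linear count is non-zero."""
    counts = time_counts(mask)
    if counts["linear"] != 0:
        return "-"
    if counts["quadratic"] != 0:
        return "L"
    if counts["cubic"] != 0:
        return "Q"
    return "C"


def order(mask):
    """Interaction order: 0 for a main effect, 1 for a first order interaction, ..."""
    return bin(mask).count("1") - 1


by_order = {}
for mask in range(1, N):
    by_order.setdefault(order(mask), set()).add(restriction(mask))
```

In this sketch every treatment of a given order carries the same label, so the five main effects are the only treatments with a non-zero linear time count.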

The  method  consists  of  developing  a  new  algebra  whereby  each  of  the 
31  treatments  is  denoted  by  the  number  of  factor  level  changes.  In  effect 
the  new  algebra  permutes  the  31  columns  of  Figure  2  into  an  optimal  design. 

In  the  next  section  the  development  will  be  presented  via  illustration. 

In Chapter 6 admissible sets of run orders for various cases will
be constructed. In these cases whenever the designer has the option to
randomize, it is to be understood that he can also randomize with respect
to two equivalence classes.

One equivalence class is defined on the factor names. That is, the names
(for example, A, B, C, D, or E) can be chosen at random for the admissible
set. There are p! elements in this equivalence class.

A second equivalence class is defined on the choice of the high and
low levels for one or more factors. That is, the designer can choose the
plus and minus signs for each main effect at random. There are N elements
in this equivalence class.

4. ALGEBRA. Multiplication of any two of the 31 treatments defined
by Figure 2 entails pairwise multiplication of the 32 elements making
up each of the columns of Figure 2. The classical method of
multiplication will be utilized, whereby numbers, rather than letters, will
be used to denote the treatment names. These numbers are the number
of factor level changes for that particular treatment. That is, in
Figure 4 instead of denoting the treatments by column one, column two
will be used to denote the treatments as assigned by the standard
Yates notation. As an example, the classical multiplication given as
follows:

AC  *  ABD  =  BCD 

is  represented  in  the  new  algebra  as  follows: 

24  *  19  =  11 

Note that this triplet can be represented in three different ways
as follows:

(i) 24 * 19 = 11

(ii) 19 * 11 = 24

(iii) 24 * 11 = 19

Figure 5 displays the 155 possible unique triplets as representation
(iii) in a two-way table. To read off any product from Figure 5,
note that the maximum value is the row, the minimum value is the column,
and the value in between is the element of the matrix or body of the
table. In Figure 5 all 465 different triplets could have been
displayed by filling in the blanks. However, by filling in only
representation (iii) as defined above a pattern emerges. On extension
to higher level designs, this pattern can be taken into account in
developing a recursive method.
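
The algebra can be illustrated by multiplying the sign columns directly and then renaming each column by its number of factor level changes. This is a minimal sketch under the bit-pattern encoding assumed here (bit 0 for A through bit 4 for E); the function names are hypothetical.

```python
from itertools import combinations

N = 32
FACTORS = "ABCDE"


def mask_of(name):
    """Bit pattern of a treatment written in letters, e.g. 'ABD'."""
    return sum(1 << FACTORS.index(ch) for ch in name)


def column(mask):
    """Sign column of a treatment over the 32 runs in standard Yates order."""
    return tuple(-1 if bin(mask & ~run).count("1") % 2 else 1 for run in range(N))


def level_changes(col):
    """Number of run-to-run sign changes in a column."""
    return sum(1 for a, b in zip(col, col[1:]) if a != b)


# Name every treatment by its number of factor level changes.
COLUMNS = {level_changes(column(m)): column(m) for m in range(1, N)}
NAME = {col: c for c, col in COLUMNS.items()}


def multiply(a, b):
    """Product of two distinct treatments, all named by level changes."""
    return NAME[tuple(x * y for x, y in zip(COLUMNS[a], COLUMNS[b]))]


# Unordered triplets {a, b, a * b}; each can be written in three ways.
triplets = {frozenset((a, b, multiply(a, b))) for a, b in combinations(COLUMNS, 2)}
```

In this sketch the product of two level-change numbers coincides with their bitwise exclusive-or, for example 24 ^ 19 == 11. That identity is an observation about the sketch rather than a statement taken from the paper, but it suggests one compact way to tabulate Figure 5.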

Figure 5. The 155 Possible Multiplications

5. SIEVE. In order to generate optimal or admissible designs
the procedure entails development and utilization of a technique which
shall be called a sieve. The first step of the sieve is formed by
displaying the information from Figure 5 in Figure 6 for all 465
possible triplets. In Figure 6 each one of the 31 treatments is determined
by any one of the corresponding 15 pairs. That is, the pairs are
choices for the two main effects A* and B* and the product is the
choice for the treatment AB*. The superscript * denotes the
treatments belonging to a possible candidate for an optimal or admissible
design. Further, in Figure 6, the symbols "L", "Q", or "C" are
taken from Figures 1 and 4 and displayed as an aid for sifting out
the desired restrictions for the various cases of Figure 1. The
idea is to sequentially search down each of the 31 blocks of Figure 6
and sift out the desired candidates for an admissible design. After
this first step of the sieve, the designer will have possible candidates
for A*, B*, and AB*.

The second step of the sieve is concerned with finding the main
effect C* given candidates A*, B*, and AB*. Since the main effects
can be relabeled with respect to equivalence classes, the choice for
C* can be subjected to the following constraint:

A* < B* < C*

Now to choose C*, suppose that A* and B* are fixed at [illegible] and
"9" respectively. Then, for this example, Figure 7 displays 28 possible
choices for C*. In Figure 7, for any choice of C*, the remaining
three treatments in that same row are automatically determined and
assigned as shown in Figure 8 (for example, for the second row of
Figure 7). That is, the treatments in each row of Figure 7 for C*
can be permuted, but only these seven rows can be defined.

Figure 7. Choices for C*

Figure 8. Choices for AC*, BC*, and ABC*

In applying the sieve, the last two rows of Figure 8 can be crossed
out, for the example, due to the ordering constraint on these three
candidates for the main effects. This ordering constraint will also
reduce the set of choices given in Figure 7. Case restrictions will
further reduce the set of choices. Therefore, as the sequential search
for candidates progresses, or as A* and B* increase in value, the set
of possible choices for each new C* decreases. Usually, the possibilities
need not be exhaustive as shown by the cases studied in Chapter 6.

At this stage of the sieve, for each possible candidate for an
admissible design, it turns out that seven out of the 31 possible
treatments are now fixed. The third step of the sieve is concerned
with finding admissible choices for D* and E*. To continue the
sequential search, the ordering constraint is extended as follows:

A* < B* < C* < D* < E*

Suppose that the candidate under consideration at this step is given
by the first row of Figure 8. The new candidates will be found from
the blocks of Figure 6. For this example, the best candidate for D*
is "13". Further, on checking the 13th block of Figure 6 and crossing
out the seven pairs corresponding to the seven fixed treatments, the
best candidate for E* is "16". These two candidate blocks are repeated
from Figure 6 as Figure 9 but without any case restrictions. Also in
Figure 9 the seven treatments for this example are circled. As a check
on the validity of the chosen design, note that in Figure 9, each
block has seven pairs that are eliminated. Case restrictions would
eliminate more pairs. Due to the ordering constraint and since the sum
of the factor level changes for the main effects is to be minimized,
only one pair of D* and E* treatments need be found for each candidate
up to this step of the sieve. However, the three main effects from
step 2 will not have a sum that strictly increases or decreases as
the sequential search progresses.

After all admissible designs are sufficiently searched and
displayed, the designer selects the optimal design with respect to the
particular case under consideration. However, due to the ordering
criterion and the fixed choice of the plus and minus signs in Figure 2,
the above selection is up to an equivalence class. Therefore, at this
point, the designer has the option to randomize.

In order to be absolutely sure that the selected design is a
valid design, the plus and minus signs of the main effects can be placed
back through the standard Yates notation via the factor level changes
as shown in Figure 10. In Figure 10 the design to be validated is given
by the last row while the next to last row is the corresponding Yates
notation from Figure 2. Here, a plus sign denotes a value of one and
a minus sign denotes a value of zero. Thus, the Yates count is
determined by writing the binary count of the five digit number of each
row plus one. The Yates count for a valid design should include all
numbers from 1 to 32.
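
The validity check just described can be sketched as follows. A design is passed as the five level-change numbers chosen for A*, B*, C*, D*, and E*; taking A* as the least significant binary digit is an assumption of the sketch (any fixed digit order gives the same verdict), as are the function names.

```python
N = 32


def column(mask):
    """Sign column of a treatment over the 32 runs in standard Yates order."""
    return [-1 if bin(mask & ~run).count("1") % 2 else 1 for run in range(N)]


def level_changes(col):
    """Number of run-to-run sign changes in a column."""
    return sum(1 for a, b in zip(col, col[1:]) if a != b)


# Columns of Figure 2 indexed by their number of factor level changes.
COLUMNS = {level_changes(column(m)): column(m) for m in range(1, N)}


def yates_counts(design):
    """Yates count of every run: plus is 1, minus is 0, binary value plus one."""
    counts = []
    for run in range(N):
        digits = [1 if COLUMNS[c][run] > 0 else 0 for c in design]
        counts.append(1 + sum(d << k for k, d in enumerate(digits)))
    return counts


def is_valid(design):
    """A valid design visits each of the 32 treatment combinations exactly once."""
    return sorted(yates_counts(design)) == list(range(1, N + 1))
```

For the standard Yates design, written as (31, 15, 7, 3, 1), the counts simply run from 1 to 32 in order. A set of main effects such as (10, 18, 20, 22, 26), which is shown to be inadmissible in Chapter 6, repeats some counts and omits others.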

6. CASE STUDY. Figure 1 summarizes seven cases with various time trend
restrictions. Figure 11 shows how these cases or sets are included in
each other. The case represented by (L, -, -) has a large number of
elements or admissible designs, as does the case with no restrictions.
Therefore, these two cases will not be analyzed but are shown in Figure 11
to complete the picture. As more restrictions are placed on the design,
or as more arrows in Figure 11 are traced, the total number of factor
level changes increases and the trade-off becomes a managerial decision.

Note that Figure 11 is not drawn to any scale.


FIGURE 11. Inclusion of Cases


The logic for generating the admissible sets for the various cases
has been programmed in FORTRAN. Table look-ups, "IF" statements, and
loops simulate the sieve, the order constraints, and the restrictions
and drive the sequential search.

CASE 1. (L, L, L). For this case the 5 treatments denoted in
Figure 6 must be designated as third or fourth order treatments.
Therefore, up to an equivalence class, this set could have, at the most, 6
admissible designs. If 4 of the 5 possible third order interactions
(treatments) are fixed then the fifth one is determined. Therefore, there
are only 5 admissible designs and these five designs are displayed in Figure 12.
In Figure 12 the 5 admissible designs are generated as follows. The first
4 treatments are fixed, thus determining the next 11 treatments. The
treatments in line number 16 are fixed next, thus determining the rest of
the treatments. The sum given in the last row characterizes each design
and is found by adding the factor level changes or the values denoting
the 5 main effects.

FIGURE 12.

CASE 2. (Q, Q, -). This case admits an empty set, as shown in the following.
Utilizing the first step of the sieve, Figure 13 arrays the possible candidates
as given by Figure 6 where each treatment of the triplet has an assigned
Q or C. Using the ordering constraint, these triplets have been ordered
in Figure 13. However, this ordering can be reversed if necessary. But
the second step of the sieve cannot be filled, since 6 of the 7 required
treatments for each candidate at this step must be taken from Figure 13,
thus admitting an empty set.

FIGURE 13. Candidates for Case 2 from Step 1 of the Sieve

CASE 3. (C, L, -). This case also admits an empty set. This can
be shown in a similar fashion as shown in case 2 or by looking at the
5 treatments making up case 4 and putting on the further restriction on
the first order interactions. To repeat the proof from case 2,
Figure 14 arrays the possible candidates from the first step of the
sieve. Note that in Figure 14 there are only 5 possible candidates for
the main effects and the following product violates any possible designs:

10 * 18 * 20 * 22 = 26

That is, ABCD* and E* must be different, thus showing that the set for
case 3 is also empty.

FIGURE 14. Candidates for Case 3 from Step 1 of the Sieve

CASE 4. (C, -, -). For this case Figure 15 arrays the possible
candidates from the first step of the sieve. Here there are only 6
possible candidates for the main effects, but one of these is inadmissible
due to the following product violation:

10 * 18 * 26 = 2

20 * 22 = 2

This product violation is found on execution of steps 2 and 3 of the sieve.
Figure 16 arrays the main effects and the first order interactions for
the 5 admissible designs for this case along with the sum of the factor
level changes. Figure 16 also shows that the set for case 3 is empty, since
each design has at least one first order interaction that violates the
further restriction imposed by going from case 4 to case 3.
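
The counts quoted for case 4 can be reproduced by plain enumeration, which is used here in place of the sieve. The sketch below is not the FORTRAN logic of this paper; the integer polynomials and all names are assumptions of the sketch.

```python
from itertools import combinations

N = 32
u = [2 * t - 33 for t in range(1, N + 1)]
POLYS = [u,                                        # linear
         [(v * v - 341) // 4 for v in u],          # quadratic
         [(v ** 3 - 613 * v) // 12 for v in u]]    # cubic


def column(mask):
    """Sign column of a treatment over the 32 runs in standard Yates order."""
    return [-1 if bin(mask & ~run).count("1") % 2 else 1 for run in range(N)]


def level_changes(col):
    return sum(1 for a, b in zip(col, col[1:]) if a != b)


COLUMNS = {level_changes(column(m)): column(m) for m in range(1, N)}


def trend_free(c, degree):
    """True if treatment c has zero time counts for all polynomials up to `degree`."""
    return all(sum(s * p for s, p in zip(COLUMNS[c], poly)) == 0
               for poly in POLYS[:degree])


def is_valid(design):
    """The five main effect columns must produce 32 distinct runs."""
    rows = {tuple(COLUMNS[c][run] for c in design) for run in range(N)}
    return len(rows) == N


# Case 4, (C, -, -): main effects must be free of linear, quadratic, and cubic trend.
candidates = sorted(c for c in COLUMNS if trend_free(c, 3))
admissible = [d for d in combinations(candidates, 5) if is_valid(d)]
rejected = [d for d in combinations(candidates, 5) if not is_valid(d)]
sums = sorted(sum(d) for d in admissible)
```

Under these assumptions the six candidates are 10, 18, 20, 21, 22, and 26, the one rejected choice is the set that omits 21, and the five admissible designs have sums between 91 and 107.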

CASE 5. (Q, L, -). This case admits a very large set of admissible
designs. Figure 17 displays some of these designs, which were generated
in a fraction of a second on the Univac 1108 computer, along with the
total sum of factor level changes. The designs with sums less than
70 were chosen to illustrate the possibilities.

Figure 17. Some Possible Designs for Case 5

CASE 6. (L, L, -). This case also admits a very large set of
admissible designs, a set much larger than the set for case 5. Figure 18
displays some of these designs, which were again generated in a fraction
of a second on the Univac 1108 computer. The designs with sums less
than 56 were chosen to illustrate the possibilities. The design with
a sum of 43 is optimal. For comparative purposes the standard Yates
design has a sum of 57 plus non-zero time counts in the main effects.
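
The optimal sum of 43 for case 6 can be checked by exhaustive enumeration over all choices of five main effects, again used here as a stand-in for the sieve. The helper names are hypothetical, and the restriction is applied as zero linear time count for every main effect and every first order interaction.

```python
from itertools import combinations

N = 32
TIME = [2 * t - 33 for t in range(1, N + 1)]   # linear trend, -31, ..., 31


def column(mask):
    """Sign column of a treatment over the 32 runs in standard Yates order."""
    return tuple(-1 if bin(mask & ~run).count("1") % 2 else 1 for run in range(N))


def level_changes(col):
    return sum(1 for a, b in zip(col, col[1:]) if a != b)


COLUMNS = {level_changes(column(m)): column(m) for m in range(1, N)}
NAME = {col: c for c, col in COLUMNS.items()}


def product(a, b):
    """Product of two distinct treatments named by level changes."""
    return NAME[tuple(x * y for x, y in zip(COLUMNS[a], COLUMNS[b]))]


def linear_free(c):
    return sum(s * t for s, t in zip(COLUMNS[c], TIME)) == 0


OK = sorted(c for c in COLUMNS if linear_free(c))
PAIR_OK = {(a, b) for a, b in combinations(OK, 2) if product(a, b) in OK}


def is_valid(design):
    rows = {tuple(COLUMNS[c][run] for c in design) for run in range(N)}
    return len(rows) == N


def admissible(design):
    """Case 6, (L, L, -): linear-free main effects and first order interactions."""
    return all(pair in PAIR_OK for pair in combinations(design, 2)) and is_valid(design)


designs = [d for d in combinations(OK, 5) if admissible(d)]
best = min(designs, key=sum)
```

In this sketch the minimum is attained by the main effects (2, 4, 9, 12, 16); whether this is the design listed first in Figure 18 cannot be confirmed from the text alone.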

Figure 18. Some Possible Designs for Case 6

CASE 7. (Q, -, -). This case is included for comparison purposes.
Although it is much larger than cases 4 and 5, it turns out that it
has the same optimal design as case 5 as given by the first design
of Figure 17.

To compare these cases further, the optimal design for the case
expressed by (L, -, -) is given as (2, 4, 5, 8, 16) with a sum of 35.
Further, the optimal design for the case having no restrictions is given as
(1, 2, 4, 8, 16) with a sum of 31 or N-1 as shown by the references.
However, on restricting to the standard Yates notation, as this paper
has done, this is the only possible design, up to an equivalence class,
with a sum of 31. On relaxing the standard Yates restriction, as the
references do, many designs can be found with a sum of 31, but with
non-zero time counts.
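
Because these two cases restrict only the main effects, a simple greedy selection suffices to reproduce the quoted optima: take the treatments in increasing order of level changes and keep each one that still yields a valid set of columns. The greedy rule is an assumption of this sketch, not the sieve of Chapter 5, and the names are hypothetical.

```python
N = 32
TIME = [2 * t - 33 for t in range(1, N + 1)]   # linear trend, -31, ..., 31


def column(mask):
    """Sign column of a treatment over the 32 runs in standard Yates order."""
    return [-1 if bin(mask & ~run).count("1") % 2 else 1 for run in range(N)]


def level_changes(col):
    return sum(1 for a, b in zip(col, col[1:]) if a != b)


COLUMNS = {level_changes(column(m)): column(m) for m in range(1, N)}


def distinct_rows(design):
    """Number of different sign patterns that the chosen columns produce."""
    return len({tuple(COLUMNS[c][run] for c in design) for run in range(N)})


def cheapest_design(allowed, n_factors=5):
    """Greedy choice of main effects with the fewest total level changes."""
    chosen = []
    for c in sorted(allowed):
        # k usable columns must split the 32 runs into exactly 2**k patterns.
        if distinct_rows(chosen + [c]) == 2 ** (len(chosen) + 1):
            chosen.append(c)
        if len(chosen) == n_factors:
            break
    return tuple(chosen)


linear_free = [c for c in COLUMNS
               if sum(s * t for s, t in zip(COLUMNS[c], TIME)) == 0]

unrestricted = cheapest_design(COLUMNS)        # no restrictions
case_l = cheapest_design(linear_free)          # (L, -, -)
```

The greedy rule would not be adequate for cases that also restrict the interactions, since accepting a cheap main effect early can then rule out every later completion.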

7. APPLICATIONS. The application of the techniques presented in
this paper to operational testing can best be shown by giving an example.
For that purpose, an experimental design for an operational test of the
hypothetical ZAP anti-tank weapon will be constructed.

After analysis of the system to be tested, five factors are chosen
to be included in the design, each factor being taken at two levels,
thus giving a 2^5 factorial experiment. The factors chosen and their
associated levels are shown in Figure 19.

The importance of eliminating time trends in such a test can easily
be seen. With so few factors being controlled, there exists the
possibility that some uncontrolled and unmeasured factor is influencing test
results. Such factors as weather, crew learning, and crew morale can,
and usually do, change with time through the test.

Another consideration in designing this test is the ease of
execution of the design. Quite often a penalty must be paid in time, money,
and perhaps test validity for each factor level change which is made.
For instance, changing the visibility factor between day and night too
often would greatly slow the test execution and destroy any attempt
to portray a realistic combat scenario, as it would permit only a small
number of firings during daylight and then delay further testing until
night in order to achieve the desired factor level change. Similarly,
it may be difficult and time consuming to frequently move the test
participants and test team from one location to another in order to achieve
changes in the terrain factor. As a third example, frequent changes
in the weapon factor may confuse the test participant and prevent him
from performing as well as he might if he were allowed to stay with
one weapon. For example, one weapon may require the soldier to lead
a moving target while the other weapon does not. If the test participant
is frequently switching back and forth, he may forget and lead when
he should not or not lead when he should. Even if he does remember
and does the right thing, he may not do it as proficiently as if he
had been able to concentrate on developing a single skill instead of
two.

With the foregoing constraints in mind, we can use the techniques
presented in this paper to design a good test of our hypothetical
anti-tank system.

If it is felt desirable to strongly protect the main effects, we
could choose case five, which eliminates linear and quadratic time trends
for the main effects and linear time trends for the first order interactions.
To construct our design we select one of the admissible run orders found
for case five, as given in Figure 17. This selection can either be made
randomly or the one with the minimum total number of factor level changes
can be chosen. For our example, let us choose the design which minimizes
the factor level changes. We can then construct our experimental design
by going back to the standard Yates notation and writing out the level changes
for the five factors as defined by the level change numbers given in
Figure 17. This design is given in Figure 20. As with the selection
from the set of admissible run orders, the assignment of
the five factors to the five columns can be done either randomly or by
ordering the factors based on which factor should have the fewest level
changes and which could have more level changes.

Figure 19. Operational Test


Suppose after examining Figure 20 we feel this design is not desirable
because the number of factor level changes for visibility, weapon, and
terrain are excessive for the reasons discussed in paragraph 4 of this
chapter. One alternative would be to relax the constraints on the
elimination of higher order time trends. We could decide to select a design
which eliminates only linear time trends for the main effects and first
order interactions. For this we can choose case six. Figure 18 gives
admissible run orders for case six. Going through the same procedure
as for case five, we come up with the design given in Figure 21.

Given that this design is determined to be satisfactory, it only
remains to randomly assign a plus or minus to the actual level names
for each factor. For ease of planning the conduct of the test, it may
prove convenient to display the design information of Figure 21 in a
more conventional format as shown in Figure 22, where the number in each
cell gives the order of execution of each test event in filling out
the full factorial design.

Figure 20. Case 5 Candidate Design for the ZAP Test

Figure 21. Case 6 Candidate Design for the ZAP Test

8. FUTURE WORK. The computer logic for recursively generating
factorial designs having more than five factors would be desirable.
Admissible designs with a mix of two and three level factors would be
more realistic. Of further concern would be optimal fractional factorial
designs.
ABSTRACT. A method of estimating error variance in a
non-replicated experiment by separating an interaction term
into sums of squares of non-additivity and sums of squares
pertaining to error was examined. A sequential procedure
to test individual degrees of freedom of the interaction term
for non-additivity was introduced. Five test statistics that
could be applied to the sequential procedure are given. The
critical values needed for each of the test statistics for
$\alpha$ = 0.05 and 0.15, for 10, 20, and 30 degrees of freedom
in the term being tested, and for three stages of
the sequential procedure were estimated by Monte Carlo methods.

The five test statistics were compared as to their power
and ability to estimate error variance when non-additive
individual sums of squares were combined with individual sums
of squares that estimated error variance. The results and
recommendations as to which is the best test statistic are
given. The data indicated that using a higher level of
significance than 0.15 would better estimate error variance.

1. INTRODUCTION. Frequently, due to the nature of an
experiment or through poor planning, a design is formed
without replication. When this happens the experimenter has no
estimate of experimental error in his data. This situation
is illustrated in Table 1 taken from Fisher (1951). Since
each entry in this table represents a single observation,
there is no way to estimate experimental error. The usual
solution to this problem is to assume an additive model (no
interaction) and to use the residual sum of squares as an
estimate of error. In a model with two main effects this
means renaming the two-way interaction as error. For the
data in Table 1 the three-way interaction alone may be pooled
into error, or possibly the three-way and one or both of the
two-way interactions may be pooled, depending upon the
experiment and the analyst. Having an estimate of the error, the
experimenter may now be able to test other terms in the model
that were not testable before pooling.

The  problem  with  this  procedure  is  that  some  of  the 
pooled  sums  of  squares  may  have  estimated  interaction  and 
not  error.  If  this  happens,  the  estimate  of  the  error  will 
be  too  large  giving  the  experimenter  a  less  sensitive  test 
of  other  terms  in  the  model. 

How, then, can it be determined if the mean square of
an interaction term estimates error, interaction, or both?
This paper examines five test statistics that are designed
to answer this question. It will be restricted to fixed
models with one observation per cell. The techniques
developed can be applied to any or all interaction terms in any
n-way model.

Using the Modified Abbreviated Doolittle (MAD) computer
routine developed by Bryce (1970), the terms of a fixed
model  can  be  broken  into  single  degree  of  freedom  sums  of 
squares.  These  single  degree  of  freedom  sums  of  squares 
form  the  building  blocks  of  the  five  test  statistics.  The 
individual  sums  of  squares  of  an  interaction  term  are  ranked 
and  sequentially  tested  one  at  a  time  starting  with  the 
largest  until  non-significance  is  declared.  At  this  point, 
the  significant  single  degree  of  freedom  sums  of  squares 
are  pooled  together  as  the  part  estimating  interaction  and 
the  rest  of  the  sums  of  squares  and  their  corresponding 
degrees  of  freedom  are  pooled  into  error  which  is  hopefully 
free  of  interaction. 

This paper will compare the ability to find interaction
when present, or power, of the five test statistics and the
ability of each to estimate $\sigma^2$.




2. TEST PROCEDURE. The expected mean square of any
interaction term can be broken into two parts. The first
part contains the error variance, $\sigma^2$, and the second part
contains the sum of the remaining different possible
variance components. The number of terms in the second part
would depend on the ANOVA model. If interaction exists,
then the mean square of an interaction term estimates
the sum of the two parts of the expected mean square; i.e.,
$\sigma^2$ plus the rest of the terms. However, if interaction
does not exist, the mean square estimates only the error
variance. If for a given model interaction is not present,
it would be appropriate to pool the sums of squares and
degrees of freedom associated with the interaction terms
into the error term.
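
In symbols, and only as a sketch of the decomposition just described, the
expected mean square of an interaction term can be written as below. The
symbol $\theta$ is a label introduced here for the sum of the remaining
terms; it does not appear in the paper.

```latex
E\left[\mathrm{MS}_{\text{interaction}}\right] = \sigma^2 + \theta,
\qquad \theta \ge 0,
\qquad \theta = 0 \ \text{when no interaction is present.}
```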

The sum of squares and n degrees of freedom of a term
in the model can be partitioned into n sums of squares,
each associated with one degree of freedom. If an
interaction term is so partitioned, the resulting single degree
of freedom sums of squares estimate either error variance
or interaction. It would be desirable to extract the
portion that estimates error only, thus giving an estimate of
$\sigma^2$ and making it possible to test other terms in the model.
This procedure assumes that some of the partitioned single
degree of freedom sums of squares estimate $\sigma^2$ only and that
not all estimate interaction.

The steps for the proposed sequential procedure for
testing any interaction term and estimating $\sigma^2$ are:

1. Separate the term with n degrees of freedom into
n sums of squares containing one degree of freedom each.

2. Rank the n sums of squares.

3. Apply one of the test statistics to the largest
sum of squares.

4. Check for significance using the appropriate values
in the table for $\alpha$ and stage. (Stage is the number of the
sequential test that is being performed on the individual
sums of squares of an interaction term. For example, stage
one is the test of the largest individual sum of squares,
stage two the second largest, and so on.)

5. If significance is declared, return to step three
using the same test statistic and significance level to
test the next largest sum of squares. If no significance
is found, proceed to step six.

6. Pool the significant sums of squares and degrees
of freedom into one interaction term.

7. Pool the remaining sums of squares with their
appropriate degrees of freedom into error.
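
The seven steps can be sketched in a few lines of Python. This is an
illustrative sketch, not the authors' program: the function names are
hypothetical, the statistic is passed in as a parameter, and the example
statistic (the tested sum of squares over the total) follows the verbal
description of F3. The critical values in the usage lines are the F3 entries
of Table 2 for n = 10 and $\alpha$ = 0.05.

```python
from typing import Callable, Sequence, Tuple

Pool = Tuple[float, int]  # (sum of squares, degrees of freedom)


def largest_over_total(ranked: Sequence[float], stage: int) -> float:
    """Stage-th largest sum of squares divided by the total (illustrative)."""
    return ranked[len(ranked) - stage] / sum(ranked)


def sequential_pool(
    sums_of_squares: Sequence[float],
    statistic: Callable[[Sequence[float], int], float],
    critical_values: Sequence[float],
) -> Tuple[Pool, Pool]:
    """Return ((interaction SS, df), (error SS, df)) after sequential testing."""
    ranked = sorted(sums_of_squares)             # step 2: rank, ascending
    n_significant = 0
    for stage, critical in enumerate(critical_values, start=1):
        if stage >= len(ranked):                 # assumption: keep some error df
            break
        if statistic(ranked, stage) > critical:  # steps 3 and 4
            n_significant += 1                   # step 5: test the next largest
        else:
            break                                # step 5: stop at non-significance
    cut = len(ranked) - n_significant
    interaction = ranked[cut:]                   # step 6
    error = ranked[:cut]                         # step 7
    return (sum(interaction), len(interaction)), (sum(error), len(error))


# Made-up single degree of freedom sums of squares, for illustration only.
ss = [0.2, 0.5, 9.0, 0.1, 0.4, 0.3, 0.6, 0.8, 0.7, 0.4]
interaction, error = sequential_pool(ss, largest_over_total, [0.6051, 0.2182, 0.0911])
print(interaction, error)
```

In this made-up example only the largest sum of squares is declared
significant, so one degree of freedom goes to interaction and nine to error.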

3. TEST STATISTICS. The proposed test statistics will
be labeled F1, F2, F3, F4, and F5 for convenience, and the sum
of squares of a single degree of freedom interaction term will
be written as $S_i$, where $S_1 < S_2 < \dots < S_n$. The stage in
the sequential test procedure will be denoted by r, and n will
denote the degrees of freedom in the interaction term before
testing.

The test statistics are:

[displayed formulas for F1 through F5 illegible in the source]

F1 could be described as the sums of squares having
[remainder of sentence illegible]. F2 is the test sum of squares
divided by the smallest sum of squares. F3 is the test sum of
squares divided by the total sums of squares of the interaction
term. F4 is a composite of F1 and F3. F5 is the numerator of F3
divided by the sum of the sums of squares less than the test sum
of squares.
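
Because the displayed formulas did not survive, the sketch below is only one
possible reading. F2, F3, and F5 follow the verbal descriptions above. The
forms used for F1 (tested sum of squares over the mean of the smaller ones)
and F4 (tested sum of squares over itself plus the smaller ones) are
assumptions made here for illustration; the F1 form is suggested by the
stage one critical values of F1 in Table 2 being roughly n - 1 times those of
F5. All function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def _test_and_smaller(ranked: Sequence[float], stage: int) -> Tuple[float, List[float]]:
    """Split ascending sums of squares into the one tested at this stage
    and all those smaller than it."""
    cut = len(ranked) - stage
    return ranked[cut], list(ranked[:cut])


def f1(ranked: Sequence[float], stage: int) -> float:
    # ASSUMPTION: tested SS divided by the mean of the smaller SS.
    test, smaller = _test_and_smaller(ranked, stage)
    return test / (sum(smaller) / len(smaller))


def f2(ranked: Sequence[float], stage: int) -> float:
    # Tested SS divided by the smallest SS.
    test, _ = _test_and_smaller(ranked, stage)
    return test / ranked[0]


def f3(ranked: Sequence[float], stage: int) -> float:
    # Tested SS divided by the total SS of the interaction term.
    test, _ = _test_and_smaller(ranked, stage)
    return test / sum(ranked)


def f4(ranked: Sequence[float], stage: int) -> float:
    # ASSUMPTION: tested SS divided by itself plus all smaller SS,
    # which coincides with F3 at stage one.
    test, smaller = _test_and_smaller(ranked, stage)
    return test / (test + sum(smaller))


def f5(ranked: Sequence[float], stage: int) -> float:
    # Tested SS divided by the sum of the SS smaller than it.
    test, smaller = _test_and_smaller(ranked, stage)
    return test / sum(smaller)


ranked = [1.0, 2.0, 3.0, 4.0, 10.0]  # made-up values, already ascending
stage_one = [f(ranked, 1) for f in (f1, f2, f3, f4, f5)]
stage_two = [f(ranked, 2) for f in (f1, f2, f3, f4, f5)]
print(stage_one, stage_two)
```

Under these assumed forms F1, F3, F4, and F5 are increasing functions of one
another at stage one, so they would reject in exactly the same samples there.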

4. GENERATION OF CRITICAL VALUES. The sequential test
procedure was developed to test the hypothesis that there is no
interaction present in the single degree of freedom sums of squares
of any interaction term. This would mean that each of the
single degree of freedom interaction sums of squares estimates
$\sigma^2$ and follows a central chi-square distribution with one
degree of freedom. The null hypothesis for the test
procedure at the first stage could be written

$$H_0: \lambda_1 = \lambda_2 = \dots = \lambda_n = 0$$

where $\lambda_i$ represents the non-centrality parameter of the
chi-square associated with each of the ordered single degrees of
freedom. If the test proceeds to the second stage the null
hypothesis would be

$$H_0: \lambda_1 = \lambda_2 = \dots = \lambda_{n-1} = 0$$

and so on at other stages of the test.

Under the null hypothesis it is possible to generate
the critical values for each test statistic using single degree
of freedom central chi-squares. Two factors determine the
shape of the distribution of each test statistic: the stage
of the test and the number of degrees of freedom in the
interaction term under consideration. Using an electronic
computer, the distributions of each of the test statistics
were simulated for three stages and interaction terms of ten,
twenty, and thirty degrees of freedom. The upper portions
of the distributions were ordered and the five and fifteen
percent points were found, thereby giving an estimate of the
0.05 and 0.15 critical values under the null hypothesis.

The single degree of freedom chi-squares were formed
by generating a standard normal value and squaring it. Each
standard normal was generated by the Box-Muller (1958)
transformation using uniform values generated by the McGill
Random Number Generator Package, supplied by McGill
University. This method of generating standard normals was
found satisfactory by Thomas (1975).
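
A minimal sketch of this step follows, using the standard Box-Muller formula.
Python's built-in generator stands in for the McGill package, which is not
reproduced here, and the function names are hypothetical.

```python
import math
import random
from typing import Tuple


def box_muller(u1: float, u2: float) -> Tuple[float, float]:
    """Map two uniforms (u1 in (0, 1], u2 in [0, 1)) to two independent
    standard normal values."""
    radius = math.sqrt(-2.0 * math.log(u1))
    angle = 2.0 * math.pi * u2
    return radius * math.cos(angle), radius * math.sin(angle)


def chi_square_1df(rng: random.Random) -> float:
    """One central chi-square with one degree of freedom: a squared normal."""
    u1 = 1.0 - rng.random()  # shift [0, 1) to (0, 1] so log(u1) is defined
    u2 = rng.random()
    z, _ = box_muller(u1, u2)
    return z * z


rng = random.Random(1975)  # arbitrary seed, for reproducibility only
draws = [chi_square_1df(rng) for _ in range(20_000)]
sample_mean = sum(draws) / len(draws)
print(sample_mean)  # should be near 1, the mean of a chi-square with 1 df
```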

A more detailed explanation of how the critical values
were found for stage one and ten degrees of freedom of
interaction will now be given. Ten one-degree of freedom
central chi-squares were generated and ordered. A value for
each of the five test statistics was calculated and saved.
This process was repeated ten thousand times. The upper
portion of the ten thousand values for F1 was ordered and the
five percent and fifteen percent points were found. This
gave the estimated critical values for a stage one test of
an interaction term containing ten degrees of freedom using
F1 as a test statistic. The critical values were found in
the same manner for F2, F3, F4, and F5. This process was
repeated for twenty and thirty degrees of freedom in
interaction.
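
The stage one calculation can be sketched as follows for F3, taken here as
the largest sum of squares over the total, per its verbal description. The
function name, the seed, and the use of Python's normal generator are
assumptions of this sketch, and estimates will differ slightly from Table 2
because the random numbers differ.

```python
import random
from typing import Dict, Sequence


def stage_one_critical_values(
    n: int, reps: int, alphas: Sequence[float], seed: int = 0
) -> Dict[float, float]:
    """Estimate upper-tail critical values of max(S) / sum(S) for n central
    chi-squares with one degree of freedom each."""
    rng = random.Random(seed)
    values = []
    for _ in range(reps):
        ss = [rng.gauss(0.0, 1.0) ** 2 for _ in range(n)]
        values.append(max(ss) / sum(ss))
    values.sort()
    # The upper alpha point is the value with a fraction alpha of the
    # simulated statistics at or above it.
    return {a: values[int(round((1.0 - a) * reps)) - 1] for a in alphas}


crit = stage_one_critical_values(10, 10_000, (0.05, 0.15), seed=1)
print(crit)  # compare with the F3, n = 10, stage one entries of Table 2
```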

Stage two critical values for ten degrees of freedom
interaction terms and $\alpha$ = 0.05 were estimated by again
generating values for the test statistics in the same manner
as above. If generated numbers of the test statistics exceeded
the 0.05 critical values with ten degrees of freedom for
interaction at stage one, the test statistic for stage two was
formed and saved. This was repeated until two thousand values
at stage two were accumulated. The upper portion was ranked
and the estimate of the 0.05 critical value for stage two was
found. The same procedure was followed to find the table
values for $\alpha$ = 0.15 and so on for twenty and thirty degrees
of freedom of interaction.
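
A sketch of the conditional stage two simulation, again for F3 only, is
given below. Sets are generated until the requested number have passed the
stage one test, and the stage two statistic (second largest sum of squares
over the total) is saved for each. The function name and seed are
assumptions of this sketch.

```python
import random


def stage_two_critical_value(
    n: int, alpha: float, stage_one_crit: float, wanted: int, seed: int = 0
) -> float:
    """Estimate the stage two critical value from sets that were
    significant at stage one."""
    rng = random.Random(seed)
    kept = []
    while len(kept) < wanted:
        ss = sorted(rng.gauss(0.0, 1.0) ** 2 for _ in range(n))
        total = sum(ss)
        if ss[-1] / total > stage_one_crit:  # significant at stage one
            kept.append(ss[-2] / total)      # form and save stage two value
    kept.sort()
    return kept[int(round((1.0 - alpha) * wanted)) - 1]


# 0.6051 is the Table 2 stage one value for F3, n = 10, alpha = 0.05.
crit2 = stage_two_critical_value(10, 0.05, 0.6051, 2000, seed=2)
print(crit2)
```

Only about one generated set in twenty reaches stage two at $\alpha$ = 0.05,
which illustrates why later stages are much more expensive to simulate.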

The calculation of stage three critical values is an
extension of the stage two procedure. Values of the test
statistics were generated under the null hypothesis, and if they
exceeded the appropriate critical values of both stage one and
stage two, the test statistic for stage three was formed and
saved until two thousand were accumulated. They were then
ordered as before and the estimates of the five percent and
fifteen percent critical values were found. The complete table
of critical values generated is found in Table 2. The critical
values do not extend past stage three because of the length of
computer time that would be necessary to generate stage four
critical values.
TABLE 2

CRITICAL VALUES FOR F1, F2, F3, F4, AND F5

n is the total degrees of freedom associated with the
interaction term being tested.

$\alpha$ is the level of significance.

TEST                                            Stage
STATISTIC   n    $\alpha$            1                   2                    3

F1          10   .05           13.7882             35.6391             108.8423
            10   .15            9.0107             19.6191              46.5067
            20   .05           12.0037             20.3695              32.4610
            20   .15            8.9826             13.8655              19.9743
            30   .05           11.9484             17.4037              23.5462
            30   .15            9.1645             12.7221              15.9907

F2          10   .05        84376.4338       14924046.1125      4099285578.0629
            10   .15         8119.5734         190156.7131         4723313.6463
            20   .05       421750.6897      157984650.5641     42188251909.4520
            20   .15         4273.3229        1462966.2431        51961752.0038
            30   .05      1060700.3502      330771314.1788     97140244926.5100
            30   .15       108524.4512        5007498.1783       183702308.8679

F3          10   .05             .6051               .2182                .0911
            10   .15             .5003               .2283                .1112
            20   .05        [illegible]              .2258           [illegible]
            20   .15        [illegible]              .2093                .1357
            30   .05             .2918               .1986                .1389
            30   .15             .2401               .1788                .1279

F4          10   .05             .6051               .4495                .3263
            10   .15             .5003               .4153                .3174
            20   .05        [illegible]              .3474                .2838
            20   .15        [illegible]              .3032                .2597
            30   .05        [illegible]              .2771                .2412
            30   .15        [illegible]              .2380                .2133

F5          10   .05            1.5231              1.7930               2.1168
            10   .15             .9985              1.1222               1.3075
            20   .05        [illegible]              .6677                .7151
            20   .15        [illegible]              .4849                .5207
            30   .05             .4129               .4315                .4443
            30   .15             .3177               .3190                .3302
5. CHOICE OF $\alpha$. It may be desirable to make the test
for interaction at a relatively small alpha rather than a
large one. A small $\alpha$ under $H_0: \lambda_1 = \lambda_2 = \dots = \lambda_n = 0$
may lead to an inflated estimate of $\sigma^2$ by way of the
sequential test because when no significance is found the
test procedure is halted and the error sum of squares is
calculated. A test using a small alpha may not find
interaction when it is present, thus leading to an inflated
estimate of $\sigma^2$. Therefore, any tests of other factors in the
model using the inflated error would be conservative. With
this in mind, critical values for alpha equal to 0.05 and
0.15 were estimated.

It should be noted that the level of significance must
remain the same at all stages of the test when using the
critical values developed here. For example, it is not
appropriate to test at stage one using $\alpha$ = 0.15 and after
finding significance to test at stage two using $\alpha$ = 0.05.

6. GENERATION OF POWER DATA. Power in a sequential
test is an elusive concept. For this reason, power at stage
one is defined to be the probability of rejecting the null
hypothesis, $H_0: \lambda_1 = \lambda_2 = \dots = \lambda_n = 0$, given the null
hypothesis is false. Power at stage two is the probability
of rejecting the null hypothesis,
$H_0: \lambda_1 = \lambda_2 = \dots = \lambda_{n-1} = 0$,
given the null hypothesis is false.

Data generated to compare the power of the five test
statistics were divided into two cases. Case one consisted
of generating ten, twenty, or thirty standard normal
deviates, adding a single non-centrality parameter, $\lambda_i$, to
one of these at random, and squaring each. The result was
one non-central and (n-1) central chi-squares. The
sequential test procedure was then performed using one of the test
statistics at a level of significance $\alpha$. This was repeated
one thousand times, adding the same non-centrality parameter,
$\lambda_i$, to a new set of standard normal deviates and keeping a
record of the number of times significance was declared. An
estimate of power for the test statistic, at $\alpha$, n degrees of
freedom for interaction, and $\lambda_i$ at stage one was calculated
by dividing the number of times significance was declared
by one thousand. The above process was repeated for every
possible combination of test statistics, levels of
significance, number of degrees of freedom for interaction, and
non-centrality parameters. The non-centrality parameters are
$\lambda_1 = 1.5$, $\lambda_2 = 2.5$, $\lambda_3 = 3.5$, and $\lambda_4 = 4.5$. The sequential
test for power in case one was not carried past the first
stage. The experiment was repeated once to form an estimate
of experimental error.
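
The case one power estimate can be sketched as below for a single
combination of n, $\alpha$, and $\lambda$. The statistic is again F3 as
verbally described (largest sum of squares over the total), and the function
name, seed, and normal generator are assumptions of this sketch.

```python
import random


def case_one_power(
    n: int, lam: float, critical: float, reps: int = 1000, seed: int = 0
) -> float:
    """Fraction of simulated sets declared significant at stage one."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        z = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z[rng.randrange(n)] += lam       # shift one deviate chosen at random
        ss = [v * v for v in z]          # one non-central, n - 1 central
        if max(ss) / sum(ss) > critical:
            rejections += 1
    return rejections / reps


# 0.6051 is the Table 2 stage one value for F3, n = 10, alpha = 0.05.
power_low = case_one_power(10, 1.5, 0.6051, seed=3)
power_high = case_one_power(10, 4.5, 0.6051, seed=3)
print(power_low, power_high)
```

Note that adding $\lambda$ to a standard normal before squaring gives a
non-central chi-square whose non-centrality, in the usual convention, is
$\lambda^2$; the sketch follows the paper in calling $\lambda$ itself the
parameter.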

A test for power at both stage one and stage two was
performed in case two data. n random standard normal
deviates were again generated and non-centrality parameters
were added to two randomly selected standard normals before
squaring. The sequential test was applied and the process
repeated one thousand times, keeping count of the total
number of times significance was declared. Each time
significance was found the test would proceed to stage two to
test for significance, and a tally was kept of the number of
times the null hypothesis was rejected.

For a certain $\alpha$, test statistic, n degrees of freedom
of interaction, and set of non-centrality parameters, power
at stage one was the number of times significance was found
divided by one thousand, while power at stage two equaled
the number of times the null hypothesis was rejected at
stage two divided by the total number of tests made. (The
total number of tests made at stage two was the number of
times significance was declared at stage one.)
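
The two power definitions can be sketched as follows for one pairing of
non-centrality parameters. The statistic is F3 as verbally described, the
critical values are the Table 2 entries for F3 with n = 10 and $\alpha$ =
0.05, and the function name and seed are assumptions of this sketch.

```python
import random
from typing import Sequence, Tuple


def case_two_power(
    n: int,
    lam_pair: Tuple[float, float],
    criticals: Sequence[float],
    reps: int = 1000,
    seed: int = 0,
) -> Tuple[float, float, int]:
    """Return (stage one power, stage two power, number of stage two tests)."""
    rng = random.Random(seed)
    stage_one_hits = 0
    stage_two_hits = 0
    for _ in range(reps):
        z = [rng.gauss(0.0, 1.0) for _ in range(n)]
        i, j = rng.sample(range(n), 2)   # two distinct deviates get shifted
        z[i] += lam_pair[0]
        z[j] += lam_pair[1]
        ss = sorted(v * v for v in z)
        total = sum(ss)
        if ss[-1] / total > criticals[0]:
            stage_one_hits += 1
            if ss[-2] / total > criticals[1]:
                stage_two_hits += 1
    power_one = stage_one_hits / reps
    # Stage two power is conditional on having reached stage two.
    power_two = stage_two_hits / stage_one_hits if stage_one_hits else float("nan")
    return power_one, power_two, stage_one_hits


power_one, power_two, tested = case_two_power(10, (1.5, 4.5), (0.6051, 0.2182), seed=4)
print(power_one, power_two, tested)
```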

The above power for case two was calculated
independently for each combination of degrees of freedom of
interaction, test statistics, levels of significance, and pairs
of non-centrality parameters. As in case one, the experiment
was replicated once. There were ten different pairings of
$\lambda_i$, $\lambda_j$ added to form non-central chi-squares. These are
listed in Table 3.


Table 3

Pairings of Non-centrality Parameters
Added for Case Two Power

$\lambda_1$   1.5   2.5   3.5   4.5   1.5   1.5   1.5   2.5   2.5   3.5

$\lambda_2$   1.5   2.5   3.5   4.5   2.5   3.5   4.5   3.5   4.5   4.5
7. GENERATION OF MEAN SQUARE ERROR DATA. As the above
procedure for power was being performed, data for an analysis
of the ability of the test statistics to estimate $\sigma^2$ was
also being compiled.

As each set of ten, twenty, or thirty chi-squares was
generated for case one data, the test procedure would check
for significance at different stages until none was found.
It would then tally the sum of squares and degrees of freedom
to be pooled into error. This would proceed until all one
thousand sets were tested. The estimate of $\sigma^2$ was then
calculated by dividing the total sums of squares pooled into
error by the pooled degrees of freedom. If significance was
found at each of the first three stages in any of the one
thousand sets, (n-3) degrees of freedom and the sums of
squares not declared significant were added to error. Since
these data were calculated simultaneously with the power,
there are two independent observations for all combinations
of test statistics, degrees of freedom in interaction,
non-centralities, and levels of significance. The case one mean
square error data were calculated for five $\lambda_i$, four being
the same as in the power analysis and the fifth being equal
to zero.

Mean square error data for case two were generated
simultaneously with case two power data. As both a stage one
power test and stage two power test were performed for case
two data, mean square error data were also collected at both
the stage one power test and stage two power test. Case
two mean square error data will be labeled and discussed in
terms of stage of power test. This avoids the problem of
thinking of the MSE data as "stage one MSE" and "stage two
MSE", which carries the wrong connotation since both errors
are estimated using the three-stage sequential procedure.

Mean square error data at stage one power test were
collected as follows. The sequential (up to three stages)
procedure was applied to each set of n single degree of
freedom interaction sums of squares. If non-significance occurred
at stage one, all n sums of squares were pooled into the error
estimate. When significance was declared at stage one but
not at stage two, (n - 1) sums of squares were pooled into
the error estimate, and with significance at stages one and
two but not at stage three, (n - 2) sums of squares were pooled
into the error estimate. It was decided that if significance was
found at all three stages, the remaining (n - 3) sums
of squares would be pooled into error. Thus, each of the
one thousand sets of n sums of squares contributed
something to the estimate of error.
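
The pooling rule can be sketched as below. The statistic is F3 as verbally
described, and the function name and the small input sets are hypothetical
values chosen so the arithmetic can be checked by hand.

```python
from typing import Sequence


def pooled_error_estimate(
    sets: Sequence[Sequence[float]], criticals: Sequence[float]
) -> float:
    """Pool non-significant sums of squares over all sets and divide by
    the pooled degrees of freedom."""
    pooled_ss = 0.0
    pooled_df = 0
    for ss in sets:
        ranked = sorted(ss)
        total = sum(ranked)
        n_significant = 0
        for stage, critical in enumerate(criticals, start=1):
            if ranked[-stage] / total > critical:
                n_significant += 1
            else:
                break
        kept = ranked[: len(ranked) - n_significant]  # n, n-1, n-2 or n-3 values
        pooled_ss += sum(kept)
        pooled_df += len(kept)
    return pooled_ss / pooled_df


# Critical values above 1 can never be exceeded, so everything is pooled.
all_pooled = pooled_error_estimate([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]], (2.0, 2.0, 2.0))
# Here 8/10 exceeds 0.5 at stage one, then 1/10 does not at stage two.
one_removed = pooled_error_estimate([[1.0, 1.0, 8.0]], (0.5, 0.5, 0.5))
print(all_pooled, one_removed)
```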

Mean square data at stage two power test were collected
in a different manner than at stage one power test. The
same three stage sequential procedure was applied, but only
to those sets of n sums of squares which were declared
significant at the stage one power test. If non-significance
was observed at the stage one power test, then the set of
n sums of squares did not become a part of the error estimate
at the stage two power test. Thus fewer than one thousand
sets of n sums of squares were used in the stage two power
test estimate. One might say that the mean square error
calculated at stage two power test is "adjusted" for those
cases where non-significance was found at stage one power test.

This procedure was repeated for each combination of
n, F, $\alpha$, and pairings of $\lambda_i$, $\lambda_j$. The entire process was
replicated so that two independent estimates of error were
obtained at each design point.

The mean square error data at stage one power test are
the values of interest in this paper. They will be larger
than the mean square error values calculated at stage two
power test because the sums of squares and degrees of freedom
are pooled into the mean square error at stage two power test
only if significance was found at stage one power test. This
means that the largest individual sum of squares that is not
declared significant at stage one is never pooled into the
mean square error at stage two power test. If one decided
to estimate $\sigma^2$ only when significance was found at the first
stage of the sequential procedure, then the values of mean
square error at stage two power test would give a picture of
the results one might expect from the test statistics.
However, if one wanted an estimate of $\sigma^2$ independent of
significance being declared at stage one of the sequential
procedure, the mean square error data generated at stage one power
test will indicate which is the best test statistic.

8. METHOD OF ANALYSIS OF DATA. Analysis of variance
was used to analyze the data generated for case one power.
A four-way factorial model complete with all interactions
was formed using degrees of freedom of interaction (n), test
statistic (F), non-centrality parameter ($\lambda$), and
significance level ($\alpha$) for the four main effects. Degrees of
freedom of interaction had three levels (ten, twenty, and thirty),
test statistics had four levels (F1, F2, F3, and F5),
non-centrality parameters had four levels (1.5, 2.5, 3.5, and
4.5), and alpha had two levels (0.05 and 0.15). F4 was left
out of the analysis in case one because power was not extended
past stage one and at stage one F3 and F4 are the same test
statistic. The main effects for this model and for all models
in this paper were considered fixed.

The dependent variable in the power analysis is a
proportion. In case one data one thousand independent tests
for power were made for each combination of n, F, $\alpha$, and $\lambda$.
The proportion was formed by dividing the number of times
the null hypothesis was rejected by the total number of tests
made.


Because of the range of non-centralities used to generate
the data, it is possible that the assumption of homogeneous
variance in each cell is violated. For this reason, the
arc-sine transformation, as described by Snedecor and Cochran
(1967), was used on the data, but very little difference was
found between the analysis of the raw data and that of the
transformed data, so the analysis of the raw data was used.
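
For reference, the usual arc-sine (angular) transformation of a proportion p
is arcsin(sqrt(p)). The sketch below assumes that form, in radians; the
function name is hypothetical and the paper does not state which units were
used.

```python
import math


def arcsine_transform(p: float) -> float:
    """Angular transformation of a proportion, in radians."""
    return math.asin(math.sqrt(p))


# For a proportion based on m binomial trials, the transformed value has
# variance of roughly 1 / (4 m) regardless of p, which is why the
# transformation is used to stabilize variance across cells.
proportions = [0.05, 0.25, 0.50, 0.95]
transformed = [arcsine_transform(p) for p in proportions]
print(transformed)
```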

Case two power data were analyzed using a five-way
factorial model. The five main effects were degrees of
freedom for interaction (ten, twenty, and thirty), alpha (0.05
and 0.15), test statistic (F1, F2, F3, F4, and F5),
non-centralities (the ten pairs in Table 3), and stage (stage
one and stage two). The number of binomial results going
into each observation of case two power data varied with
stage. At stage one, one thousand binomial results went
into each observation, while at stage two the number of
binomial results that went into each observation was the number
of times significance was declared out of the one thousand
trials at stage one. This is because the sequential test
procedure does not proceed to stage two unless significance
occurs at stage one. Analysis was performed on the raw
data and also on a weighted arc-sine transformation of the data,
weighted by the number of binomial results making up each
observation. Very little difference was found in the results
between the two analyses, and so only the analysis of the raw
data will be considered here.
Before describing the method of analyzing the mean
square error data, consideration of what would be the best
estimate of mean square error by a test statistic in this
paper will be made. Ideally, the test statistic would
identify any single degree of freedom sums of squares that
have interaction in them and pool into error only the sums
of squares that truly estimate error. Each single degree
of freedom that estimates error is a central chi-square
with one degree of freedom and with expected value equal
to one. Since the expectation of a sum of central
chi-squares is equal to the sum of their degrees of freedom,
the expected value of the pooled sum of squares of error
when all interactions have been extracted by the test
statistic is equal to the pooled degrees of freedom. The
expected mean square error would then be equal to one. If
the test statistic fails to remove all of the interaction,
the expected mean square would be greater than one. If the
test statistic using the sequential procedure pools only
part of the single degree of freedom sums of squares that
estimate $\sigma^2$ into error, the resulting mean square error
would be less than one on the average. This is because the
sums of squares of error left in interaction would be the
largest sums of squares, not just any sums of squares
selected at random, leaving the smaller for error and thus
decreasing the expected value of mean square error. Hence,
for the data generated here, the ideal test statistic would
yield an estimate of error having an expected value equal
to one.
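
The downward bias described above can be illustrated with a small
simulation. As a deliberate simplification, the sketch always removes the
largest of n central chi-squares (as if it were always wrongly declared
significant) and pools the rest; the function name and seed are assumptions.

```python
import random


def mean_square_after_dropping_largest(n: int, reps: int, seed: int = 0) -> float:
    """Pooled mean square of the n - 1 smallest of n central chi-squares
    with one degree of freedom each."""
    rng = random.Random(seed)
    pooled_ss = 0.0
    pooled_df = 0
    for _ in range(reps):
        ss = sorted(rng.gauss(0.0, 1.0) ** 2 for _ in range(n))
        pooled_ss += sum(ss[:-1])  # the largest value is left out of error
        pooled_df += n - 1
    return pooled_ss / pooled_df


biased = mean_square_after_dropping_largest(10, 5000, seed=5)
print(biased)  # falls below the ideal value of one
```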

Analysis of variance was also used to analyze the mean
square error data of case one and case two. Although
heterogeneity of variance exists, since the observations
are central or non-central chi-squares, Scheffé (1959)
notes that if an analysis is balanced the heterogeneity of
variance has little consequence. This was seen in the
analysis of the raw and transformed power data. The analysis
of case one and case two mean square error data was performed
on the untransformed dependent variable using the error
estimate produced by replication to test terms in the model.

The ANOVA models for case one and case two mean square
error were the same as for power with three exceptions. F4
was added to the levels of the main effect for test
statistics in case one since it will estimate mean square error
differently than F3. Zero was added to the levels of the
main effect for non-centralities to investigate the ability
of the test statistic to estimate $\sigma^2$ when no interaction is
present.
The authors of this paper subscribe to the philosophy
that when it is not desirable or possible to control main
effects in an experiment it is proper to test for
significance among the levels of main effects in the presence
of interaction. This also applies to the testing of low
order interactions in the presence of significant higher
order interactions. The analyst must realize, however,
that the main effects and low order interactions have been
averaged over all other factors in the model and any
interpretation of significance must be viewed in this light.

The  analysis  of  the  power  and  mean  square  error  data 
will  be  discussed  a  case  at  a  time  instead  of  discussing 
power  completely  and  then  mean  square  error. 

9.  RESULTS  AND  DISCUSSION  OF  CASE  ONE  DATA.  Table  4 
is  the  analysis  of  variance  table  for  case  one  power  data 
and  Table  5  is  the  table  for  case  one  mean  square  error 
data.  Significance  was  found  for  almost  every  term. 

The first thing to be considered is alpha. Figure 1
contains graphs of power and mean square error for the F
by $\alpha$ interaction.

The graph of power in Figure 1 indicates that the power
is better using a larger alpha, which is not surprising,
but the graph of mean square error shows that a better
estimate of mean square error is obtained using $\alpha$ = 0.15
since the line for $\alpha$ = 0.15 is closer to one than that for
$\alpha$ = 0.05. Table 5 shows significance for main effect $\alpha$,
which indicates that using $\alpha$ = 0.15 for case one data gives
a better estimate of mean square error.

Now consider Figure 2, which contains graphs for the power and mean square error of the F by λ by α = 0.15 interaction term.

There is no significant difference between the power curves of F1, F3, F4, and F5, so power offers no help as to which test statistic is the best other than that the power of F2 is lacking. The graph of mean square error in Figure 2 shows that F2 also lacks in ability to estimate mean square error. There is no practical difference between the points of F1, F3, F4, and F5 for mean square error at λ = 0, 1.5, 2.5. At λ = 3.5, F3 is significantly higher than the
TABLE 4

ANOVA Table for Case One Power Data

Source    DF    MS        F
n          2    0.0030       23.5678
F          3    1.3084     9997.0462
nF         6    0.0009        7.3491
α          1    1.0141     7748.2736
nα         2    0.0026       20.5578
Fα         3    0.0001        0.8867*
nFα        6    0.0002        1.7236*
λ          3    3.0563    23351.5011
nλ         6    0.0010        8.1844
Fλ         9    0.2475     1891.2634
nFλ       18    0.0004        3.5115
αλ         3    0.0097       74.6679
nαλ        6    0.0002        2.1278*
Fαλ        9    0.0019       15.0068
nFαλ      18    0.0001        1.1544*
ERROR     96    0.0001

* Indicates that the term was not significant at the .05 level. No * by the F value indicates significance was declared at the .05 level.

TABLE 5

ANOVA Table for Case One Mean Square Error Data

Source    DF    MS        F*
n          2    1.1613     4570.3185
F          4    1.2276     4831.2999
nF         8    0.0866      341.0213
α          1    0.7622     2999.7058
nα         2    0.0759      299.0747
Fα         4    0.0010        4.2709
nFα        8    0.0005        2.1856
λ          4    0.8472     3334.1864
nλ         8    0.1453      571.9092
Fλ        16    0.3739     1471.8028
nFλ       32    0.0295      116.1555
αλ         4    0.0517      203.4774
nαλ        8    0.0077       30.3502
Fαλ       16    0.0026       10.4372
nFαλ      32    0.0005        2.0556
ERROR    150    0.0002

* All tests are significant at the .05 level.


other three and at λ = 4.5, F5 separates from F1 and F4. At λ = 4.5, F1 and F4 underestimate error while F3 overestimates error and F5 estimates error exactly.

The  problem  with  F2  is  that  it  will  find  significance 
if  the  smallest  sum  of  squares  is  sufficiently  small  with¬ 
out  regard  to  the  size  of  the  largest  sum  of  squares.  Even 
if  the  largest  sum  of  squares  is  large  it  will  not  be  de¬ 
clared  significant  unless  the  smallest  sum  of  squares  is 
sufficiently  small.  Thus,  F2  has  poor  power  and  greatly 
overestimates  mean  square  error. 

At λ = 4.5, F3 estimates σ² to be 1.023. This is significantly different, using Scheffé's test at α = 0.05, compared to the F5 estimate of 1.000. As λ gets large, F3 tends to overestimate σ². This is due to the presence of the non-central chi-square in the denominator of F3.


F1 and F4 have the same numerator, a sum whose index runs from i = n − r + 1, which leads to their underestimation of σ² at λ = 4.5. The test for mean square error in case one only goes as far as stage three. Any single degree of freedom sum of squares declared significant at stage one will remain in the numerator for the stage two test. One large single degree of freedom interaction sum of squares could easily cause a type one error at stage two because of the inflated numerator of the test statistic. This would lead to an underestimation of σ².

To further investigate F1 and F4, consider the graph of the n by F by α = 0.15 interaction on mean square error, which is shown in Figure 3.

The points of F1 and F4 for n = 30 are lower than one. As the number of individual sums of squares gets larger, the probability of a large central chi-square being present increases. The numerators of F1 and F4 will be inflated at stage two with one significant individual sum of squares and a large central chi-square present. Thus a type one error at stage two and possibly at stage three could occur. This would keep large central chi-squares from being pooled into error and would cause an underestimate of σ².


10. RESULTS AND DISCUSSION OF CASE TWO DATA. Tables 6 and 7 contain the analysis of variance tables for case two power and mean square error data respectively. Significance was found for every term in both tables.

To find the better α for case two, consider Figure 4, which is the F by α interaction on power and the F by α by stage one power test interaction on mean square error.

As in case one, α = 0.15 estimates mean square error better than α = 0.05, but Figure 4 shows that the α = 0.15 curve isn't as close to one as it was in case one data. This suggests that when two individual sums of squares associated with interaction are present, using a higher α will better estimate σ². Figure 4 also shows that F2 has poor power and greatly overestimates mean square error. For these reasons F2 will be dropped from any further discussion.

Figure 4 also shows that F1 and F4 have the best power of the five test statistics. This is further illustrated by Figure 5, a graph of the F by λ at α = 0.15 interaction on power.

The power of F1, F3, F4, and F5 are very close when pairs of λ are equal, but when the pairs of λ become unequal the pattern changes. As the difference between the non-centralities gets larger, the difference in power between F1 and F4 compared to F5 and F3 also spreads. The reason for this becomes obvious after seeing Figures 6 and 7.

Figure 6, which is F by λ by α = 0.15 by stage one on power, shows no practical difference in power between F1, F3, F4, and F5, but Figure 7, which is F by λ by α = 0.15 by stage two on power, shows wide differences in power.

The differences in Figure 5 originate in Figure 7, since Figures 6 and 7 make up Figure 5. Figure 7 is power at stage two, or rejecting H₀: λ₁ = λ₂ = ⋯ = λₙ₋₁ = 0 when it is false. The real difference in power between F5 compared with F1 and F4 begins as the pairs of non-centralities start to spread. F1 and F4 have better power because the significant sum of squares at stage one is still in the numerator, and when it combines with the smaller non-centrality significance is still found. At the same time F5 and F3 are testing the smaller non-centrality and not finding it significant as often as F1 and F4. As the smaller
TABLE 6

ANOVA Table for Case Two Power Data

Source    DF    MS        F*
n          2    2.0076     6662.1818
F          4    4.7802    15862.9815
nF         8    0.1806      599.5040
α          1    9.3725    31102.2565
nα         2    0.0699      231.9816
Fα         4    0.0824      273.6274
nFα        8    0.0035       11.7148
λ          9    2.8400     9424.7329
nλ        18    0.1017      337.8058
Fλ        36    0.1657      550.1733
nFλ       72    0.0078       25.8975
αλ         9    0.0291       96.6522
nαλ       18    0.0093       31.1496
Fαλ       36    0.0026        8.6435
nFαλ      72    0.0007        2.5451
r          1    0.6061    21922.1602
nr         2    0.2618      868.7970
Fr         4    0.3465     1149.9805
nFr        8    0.0427      141.7871
αr         1    0.0652      216.6943
nαr        2    0.0072       23.9867
Fαr        4    0.0349      116.1258
nFαr       8    0.0027        8.9653
λr         9    0.6665     2211.8393
nλr       18    0.0260       86.2968
Fλr       36    0.0540      179.3832
nFλr      72    0.0026        8.8321
αλr        9    0.0042       14.1705
nαλr      18    0.0021        7.1075
Fαλr      36    0.0024        8.2095
nFαλr     72    0.0004        1.5963
ERROR    600    0.0003

† r represents stage of power test.
* Each term is significant at the .05 level.

TABLE 7

ANOVA Table for Case Two Mean Square Error Data

Source    DF    MS          F*
n          2     37.5888     62218.3820
F          4      8.5351     14127.7110
nF         8      0.2126       351.9482
α          1      6.5986     10922.3722
nα         2      0.8028      1328.8303
Fα         4      0.0343        56.9299
nFα        8      0.0106        17.6582
λ          9      6.7756     11215.3054
nλ        18      2.3379      3869.8933
Fλ        36      0.3266       540.7006
nFλ       72      0.0078        13.0703
αλ         9      0.3890       643.9728
nαλ       18      0.0964       159.6656
Fαλ       36      0.0086        14.3329
nFαλ      72      0.0036         5.9810
r          1    126.5534    209475.7356
nr         2     30.6301     50700.0991
Fr         4      0.5413       896.1425
nFr        8      0.0734       121.5168
αr         1      5.2837      8745.8149
Fαr        4      0.0951       157.5172
nFαr       8      0.0332        55.0646
λr         9      1.3656      2260.4472
nλr       18      1.0383      1718.7646
Fλr       36      0.1123       186.0435
nFλr      72      0.0308        51.0854
αλr        9      0.1382       228.7546
nαλr      18      0.0425        70.5036
Fαλr      36      0.0046         7.7365
nFαλr     72      0.0027         4.6175
ERROR    600      0.0006

† r represents stage.
* Each term is significant at the .05 level.

Figure 4. Power vs. α = 0.05 and α = 0.15 and MSE vs. α = 0.05 and α = 0.15 at Stage One Power Test and Case Two for F1, F2, F3, F4, and F5.

non-centrality gets larger the power of F3 and F5 also increases. This property of F1 and F4 builds their power but may not help their ability to estimate mean square error. Figure 8 is a graph of the F by λ by α = 0.15 by stage one power test interaction on mean square error.

The only three places where the mean square error of F1 and F4 is significantly closer to one than the mean square error of F5 are where the non-centralities are (1.5, 4.5), (2.5, 4.5), and (4.5, 4.5). This is due to the numerators of F1 and F4 being inflated with 4.5 while F5 is testing 1.5, 2.5, and 4.5 alone. This may be fine for a test using α = 0.15, but if α = 0.25 were being used, the structure of F1 and F4 could cause them to seriously underestimate σ², whereas F5 would have neither an inflated numerator nor an inflated denominator as F3 has. This is what happened when testing data with one non-centrality of 4.5 present in case one, as illustrated in Figure 2. Figure 4 contains the points in Figure 8 averaged over non-centrality. From Figure 4 at α = 0.15 the average mean square error values are 1.401 for F5, 1.391 for F4, and 1.396 for F1. These differences can be attributed to the differences observed in Figure 8 at points where the added non-centralities were (1.5, 4.5), (2.5, 4.5), and (4.5, 4.5). The differences in the ability of F1, F4, and F5 to estimate error variance averaged over everything except α and stage one power test are of no practical importance.

Figure 9 is analogous to Figure 3 in case one. It is the n by F by α = 0.15 by stage one power test interaction for mean square error.

At n = 10 the value of F5 is significantly closer to one than F1 and F4. But as the sample size increases to n = 20 and n = 30, F1 and F4 are significantly closer to one than F5. This is because a large central chi-square is more likely to be present as the sample size increases, and the inflated numerators of F1 and F4 tend to declare a portion of the large central chi-squares significant whereas F5 does not. If a large α were being used, F1 and F4 may underestimate error whereas F5 may avoid this problem because of its structure.

11. ANALYSIS OF DATA IN TABLE 1 USING SEQUENTIAL PROCEDURE. Table 8 is an analysis of variance table of the data contained in Table 1.




TABLE 8

ANOVA Table for Data in Table 1

Source    DF    SS         MS
A          5    21221.0    4244.2
B          1     3798.5    3798.5
AB         5     6893.9    1378.8
C          4     5310.0    1327.5
AC        20     4433.0     221.7
BC         4      291.8      73.0
ABC       20     2784.2     139.2
ERROR      0        0.0       0.0
TOTAL     59    44732.4


The AC and ABC interaction terms were partitioned into single degree of freedom sums of squares and the sequential procedure using F1, F3, F4, and F5 was applied to the data. No indication of interaction was found using α = 0.15 in either the AC or ABC term. Thus, both could be pooled into error, giving an estimate of σ² equal to 180.43; however, interaction could be present in most or all of the single degree of freedom sums of squares of AC and ABC, which may lead to a type two error using the sequential procedure.
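
The pooled estimate quoted above can be checked directly from Table 8. The following is a minimal sketch, assuming the usual pooling rule (total sum of squares divided by total degrees of freedom); the function and variable names are illustrative and not part of the original analysis.

```python
# Sums of squares and degrees of freedom for AC and ABC, copied from Table 8.
interaction_terms = {
    "AC": {"ss": 4433.0, "df": 20},
    "ABC": {"ss": 2784.2, "df": 20},
}


def pooled_mean_square(terms):
    """Pool several ANOVA terms into one error mean square (total SS / total DF)."""
    total_ss = sum(term["ss"] for term in terms.values())
    total_df = sum(term["df"] for term in terms.values())
    return total_ss / total_df


pooled_mse = pooled_mean_square(interaction_terms)
print(round(pooled_mse, 2))  # 180.43
```

The result agrees with the value of 180.43 reported in the text, on 40 degrees of freedom.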

12. CONCLUSIONS. Based on the results of this paper, F1 and F4 may be as good a test statistic as F5 if the remaining sums of squares are pooled into error when significance is declared at stage three. F5 estimates σ² better in case one data than F1 and F4, but in case two data there is no practical difference. If, however, more complete tables were available (higher significance levels and critical values for more than three stages) the authors would recommend F5 as the best of the five test statistics. F5 avoids the pitfalls of F1 and F4, which would probably manifest themselves in much greater detail if critical values for more stages and larger α were available.

As far as level of significance is concerned, 0.15 is recommended over 0.05 because of the better estimate of σ² given. As the number of individual sums of squares associated with interaction increases, a larger value of α will better estimate σ². This can be seen by comparing Figure 1 with Figure 4. The results indicate that with a higher α, perhaps 0.25, σ² would be estimated with less bias than at α = 0.15.

These  conclusions  can  only  be  strictly  applied  to  the 
data  analyzed  in  this  paper.  Any  extension  to  three  or 
more  individual  sums  of  squares  containing  interaction 
without  further  research  is  speculation. 
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PLANNING  QUANTAL  RESPONSE  TESTS  FOR  ORDNANCE 
DEVICES:  THE  TWO-POINT  STRATEGY 

R.  E.  Little 
School  of  Engineering 
The  University  of  Michigan 
Dearborn,  Michigan 


ABSTRACT.  This  paper  presents  a  small  sample  strategy 
that  should  prove  to  be  useful  in  predicting  high  reliability 
(or  high  safety)  for  ordnance  devices.  The  recommended 
"two-point"  strategy  was  developed  by  the  author  for  analogous 
use  in  estimating  fatigue  reliability. 

Briefly, the "two-point" strategy incorporates the well-known up-and-down (Bruceton) strategy in its first stage to generate two (nonzero, nonunity probability) points along the assumed response distribution curve. Then, in its second stage, the strategy allocates the remaining specimens to the two corresponding stimulus levels such that the variance of the point estimate pertaining to the reliability (safety) of interest is minimized.

In essence, the issue is to find the specimen allocations which minimize the variance associated with extrapolation along the fitted response distribution to a point remote from the median. Optimally, this minimization requires testing certain specific proportions of the available specimens at carefully selected specific stimulus levels.

1. INTRODUCTION. The sensitivity of explosive devices to shock loading cannot be measured directly. Rather, the explosive device must be subjected to some arbitrary shock loading, and if the given device explodes we know that the imposed shock loading exceeded its tolerance to shock loading. On the other hand, if the given device does not explode, then we know that the imposed shock loading did not exceed its tolerance to shock loading. Conducting similar shock loading tests at various (stimulus) levels generates the following quantal response test program:
Stimulus Level         Number of Specimens    Number Responding
(e.g., drop height)    Tested                 (e.g., exploding)

The problem of interest herein is how to select $s_i$ and $n_i$ such that we obtain the most precise estimate of the critical stimulus level $s_0$ corresponding to a very low (high) probability of responding p, e.g., 0.001 or even 0.00001 (0.999 or even 0.99999). Specifically we shall describe our two-point test program and estimation method [1,2]. The two-point strategy requires considerably fewer specimens than current techniques such as the run down method [3].

2. OPTIMAL REGRESSION BACKGROUND. The following discussion is intended to serve as background material for the subsequent summary of the two-point strategy.

2.1. Simple Linear Regression Example. Consider the problem of most precise estimation of the slope $\beta$ for the simple linear model

$$Y = \alpha + \beta x + \epsilon \qquad (1)$$

Assuming a homoscedastic variance $\sigma^2$, the variance of $\hat\beta$ is given by the expression

$$\sigma^2_{(\hat\beta)} = \frac{\sigma^2}{\sum_i (x_i - \bar{x})^2} \qquad (2)$$

Elementary analysis (or intuition) shows that $\sigma^2_{(\hat\beta)}$ takes on its minimum value when: (a) only two levels of $x_i$ are used in testing, (b) these levels are spaced as widely apart as practical, and (c) $n_{total}/2$ specimens are tested at each of the two $x_i$ levels, where $n_{total}$ is the fixed number of specimens available for testing.
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This  elementary  example  illustrates  the  minimum  variance 
strategy  in  planning  test  programs.  Namely,  select  the 
stimulus  levels  and  allocate  the  test  specimens  such  that  we 
minimize  the  variance  of  some  estimate  of  direct  interest. 

This  minimum  variance  strategy  may  be  applied  to  models  with 
heteroscedastic  variances  and  with  time  and/or  cost  constraints 
[2]. 
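
The claim in the simple linear regression example can be illustrated numerically. The sketch below evaluates the slope variance $\sigma^2 / \sum (x_i - \bar{x})^2$ for three hypothetical designs over the same range; the range limits, the sample size of 12, and the design names are assumptions chosen only for illustration.

```python
def slope_variance(x_levels, sigma2=1.0):
    """Variance of the least-squares slope: sigma^2 / sum((x_i - x_bar)^2)."""
    x_bar = sum(x_levels) / len(x_levels)
    sxx = sum((x - x_bar) ** 2 for x in x_levels)
    return sigma2 / sxx


n_total = 12
low, high = 0.0, 1.0  # hypothetical practical limits on x

# Design 1: half of the specimens at each end of the range.
two_point = [low] * (n_total // 2) + [high] * (n_total // 2)
# Design 2: specimens spread evenly over the same range.
uniform = [low + (high - low) * i / (n_total - 1) for i in range(n_total)]
# Design 3: the same two levels, but an unequal allocation (3 and 9).
unbalanced = [low] * 3 + [high] * 9

var_two_point = slope_variance(two_point)
var_uniform = slope_variance(uniform)
var_unbalanced = slope_variance(unbalanced)

print(var_two_point < var_unbalanced < var_uniform)  # True
```

Under these assumptions the balanced two-level design gives the smallest slope variance, the unbalanced two-level design is next, and the evenly spread design is the least precise of the three.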

2.2. Optimal Regression Derivations for Linear Response Curves. We shall now discuss minimum variance estimation of a point on the linear response curve

$$y = F^{-1}(p) = \alpha + \beta s \qquad (3)$$

in which s refers to the stimulus level and p = F(y) is the distribution of interest (e.g., normal, logistic, extreme value-smallest). The heteroscedastic binomial variance associated with sampling at a given stimulus level is

$$\sigma^2_{(\hat p)} = pq/n \qquad (4)$$

in which p is the true probability of responding, q = (1 - p), and n is the number of specimens tested at the given stimulus level.

We may now use the variance expression for $\hat p$ to obtain a variance expression for the variate $\hat y$, using the simple relation $\sigma^2(aX) = a^2 \sigma^2(X)$ and the assumed distribution p = F(y) to obtain dp/dy, viz.,

$$\sigma^2_{(\hat y)} = \frac{pq}{n} \left( \frac{dy}{dp} \right)^2 = \frac{pq}{n \, (dp/dy)^2} \qquad (5)$$

Now by analogy with the simple linear regression example above, we conduct response tests at just two stimulus levels. Specifically, we test $n_1$ specimens at stimulus level $s_1$ and $n_2$ specimens at stimulus level $s_2$, where $n_1 + n_2 = n_{total}$ is specified prior to testing. We assume that $r_1$ specimens respond during the tests at $s_1$ and $r_2$ respond at $s_2$. Hence, the respective proportions responding are $\hat p_1 = r_1/n_1$ and $\hat p_2 = r_2/n_2$. These $\hat p_i$ values are then used to compute the corresponding $\hat y_i$ values using the relationship $\hat y_i = F^{-1}(\hat p_i)$, in which p = F(y) is the distribution function assumed for the response curve.
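The response curve of interest appears in Figure 1. Two parameter distributions plot as a straight line on appropriate probability paper, passing through the two points $[(\hat y_1, s_1), (\hat y_2, s_2)]$. Hence,

$$\hat\alpha = (\hat y_1 s_2 - \hat y_2 s_1)/(s_2 - s_1) \qquad (6)$$

and

$$\hat\beta = (\hat y_2 - \hat y_1)/(s_2 - s_1) \qquad (7)$$

Then, for any point along the line, say $(\hat y_0, s_0)$, we write

$$\hat y_0 = \hat\alpha + \hat\beta s_0 = \frac{s_2 - s_0}{s_2 - s_1}\,\hat y_1 - \frac{s_1 - s_0}{s_2 - s_1}\,\hat y_2 \qquad (8)$$

and, since $\hat y_1$ and $\hat y_2$ are independent, we see that

$$\sigma^2_{(\hat y_0)} = \left[ \frac{\partial \hat y_0}{\partial \hat y_1} \right]^2 \sigma^2_{(\hat y_1)} + \left[ \frac{\partial \hat y_0}{\partial \hat y_2} \right]^2 \sigma^2_{(\hat y_2)} \qquad (9)$$

in which

$$\frac{\partial \hat y_0}{\partial \hat y_1} = (s_2 - s_0)/(s_2 - s_1) \quad \text{and} \quad \frac{\partial \hat y_0}{\partial \hat y_2} = -(s_1 - s_0)/(s_2 - s_1) \qquad (10)$$

Next, we substitute $\sigma^2_{(\hat y_1)}$ and $\sigma^2_{(\hat y_2)}$ into (9) and introduce the notation

$$n_i w_i = 1/\sigma^2_{(\hat y_i)} = n_i \, (dp/dy)_i^2 / (p_i q_i) \qquad (11)$$

to obtain

$$\sigma^2_{(\hat y_0)} = \frac{1}{(s_2 - s_1)^2} \left[ \frac{(s_2 - s_0)^2}{n_1 w_1} + \frac{(s_1 - s_0)^2}{n_2 w_2} \right] \qquad (12)$$

Our problem now is to minimize (12) by appropriate selection of $n_1$, $n_2$, $s_1$, and $s_2$.
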
Figure 1. Response curve plotted on probability paper passing through the points $[(\hat y_1, s_1), (\hat y_2, s_2)]$.


First, consider optimum allocation of $n_1$ and $n_2$ for given values of $s_1$ and $s_2$. Substitute $n_1 = n_{total} - n_2$ into (12), and set the derivative of (12) with respect to $n_2$ equal to zero. We thus obtain the expression

$$\frac{\partial \sigma^2_{(\hat y_0)}}{\partial n_2} = \frac{1}{(s_2 - s_1)^2} \left[ \frac{(s_2 - s_0)^2}{(n_{total} - n_2)^2 \, w_1} - \frac{(s_1 - s_0)^2}{n_2^2 \, w_2} \right] = 0 \qquad (13)$$

Equation (13) is satisfied when

$$\frac{n_1}{n_2} = \pm \left( \frac{w_2}{w_1} \right)^{1/2} \left( \frac{s_2 - s_0}{s_1 - s_0} \right) = \pm \left( \frac{w_2}{w_1} \right)^{1/2} \left( \frac{y_2 - y_0}{y_1 - y_0} \right) \qquad (14)$$

where the plus sign pertains to extrapolation and the minus sign pertains to interpolation.

Substituting (14) back into (12) gives (after some algebra)

$$\sigma^2_{(\hat y_0)} = \frac{1}{n_{total} (s_2 - s_1)^2} \left[ \frac{s_2 - s_0}{w_1^{1/2}} \pm \frac{s_1 - s_0}{w_2^{1/2}} \right]^2 = \frac{1}{n_{total} (y_2 - y_1)^2} \left[ \frac{y_2 - y_0}{w_1^{1/2}} \pm \frac{y_1 - y_0}{w_2^{1/2}} \right]^2 \qquad (15)$$

where again the plus sign pertains to extrapolation and the minus sign pertains to interpolation. This variance expression may now be minimized by appropriate selection of $y_1$ and $y_2$.

Taking the derivatives of (15) with respect to $y_1$ and $y_2$ and equating these derivatives simultaneously to zero shows that the optimum values of $y_1$ and $y_2$ are independent of the value of $y_0$ of specific interest. However, because of the complex nature of the w, p (w, y) relationship, the optimum values must be determined numerically; refer to Table 1.
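
The numerical determination mentioned above can be sketched for the two symmetric distributions. The sketch assumes that the optimum is symmetric, $y_1 = -c$ and $y_2 = +c$, in which case the extrapolation form of (15) is proportional to $y_0^2 / (c^2 w(c))$, so that minimizing the variance amounts to maximizing $c^2 w(c)$. It also assumes the weight $w = (dp/dy)^2/(pq)$. The grid search and all function names are illustrative, not the author's procedure.

```python
import math


def normal_weight(y):
    """w = (dp/dy)^2 / (p q) for the standard normal distribution."""
    density = math.exp(-0.5 * y * y) / math.sqrt(2.0 * math.pi)
    p = 0.5 * (1.0 + math.erf(y / math.sqrt(2.0)))
    return density ** 2 / (p * (1.0 - p))


def logistic_weight(y):
    """For the logistic distribution dp/dy = p q, so w reduces to p q."""
    p = 1.0 / (1.0 + math.exp(-y))
    return p * (1.0 - p)


def best_symmetric_point(weight, lo=0.5, hi=4.0, step=0.001):
    """Grid search for the c > 0 that maximizes c^2 * w(c)."""
    best_c, best_value = lo, -1.0
    n_steps = int(round((hi - lo) / step))
    for k in range(n_steps + 1):
        c = lo + k * step
        value = c * c * weight(c)
        if value > best_value:
            best_c, best_value = c, value
    return best_c


c_normal = best_symmetric_point(normal_weight)
c_logistic = best_symmetric_point(logistic_weight)
print(round(c_normal, 2), round(c_logistic, 2))
```

Under these assumptions the search lands close to the normal and logistic entries of Table 1. The extreme value-smallest distribution is not symmetric, so it would need a two-dimensional search over $y_1$ and $y_2$, which is not attempted here.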
                              Optimum y               Optimum p
Distribution                  y_1        y_2          p_1       p_2
Normal                        -1.575     +1.575       0.058     0.942
Logistic                      -2.399     +2.399       0.083     0.917
Extreme Value - Smallest      -2.073     +1.269       0.118     0.971

Table 1. Optimum y and p values for minimum variance estimation of $y_0$.

NOTE: Remarkably the optimum values also pertain to minimum variance estimation of $\beta$, but the corresponding optimal allocations differ. The optimum allocations for minimum variance estimation of $\beta$ satisfy $n_1/n_2 = (w_2/w_1)^{1/2}$.


Value of y_0    Variance Ratio
                (Normal Distribution)
-1.575          1.000
-2.0            1.16
-3.0            4.6
-4.0            63.5

Table 2. Ratio of the transformed binomial variance $\sigma^2_{(\hat y_0)}$, for all tests conducted at stimulus level $s_0$, to the optimal regression variance $\sigma^2_{(\hat y_0)}$. These example results pertain to the normal distribution.
2.2.1. Discussion of Results. It is helpful in understanding the results summarized in Equation (15) and Table 1 to plot w versus p. Refer to Figure 2. Here we see that the weight w approaches zero as p approaches zero or one (viz., as y approaches minus infinity or plus infinity). This w, p (w, y) relationship indicates that if we attempt to separate $s_1$ and $s_2$ too widely, the variance of $\hat y_0$ increases because w in the denominator of Equation (15) approaches zero. On the other hand, if we do not separate $s_1$ and $s_2$ enough, then the term $(s_2 - s_1)$ in the denominator is too small. Thus, there are unique values of $s_1$ and $s_2$ (independent of $s_0$) which minimize (15): not too far apart and not too close together.

It is also helpful in understanding the optimal (weighted) regression results herein to compare the variances of $\hat y_0$ associated with optimal regression and with direct testing at the single stimulus level $s_0$ corresponding to $y_0$; refer to Table 2. Here we see that optimal regression is much more efficient than direct testing. The reason for the increased efficiency is essentially that, as evident in Figure 2, direct testing at very low or very high p values is extremely inefficient because the weights w are almost zero (i.e., the transformed binomial variability is so large). The optimal regression strategy, on the other hand, allocates specimens to stimulus levels where the weights are much higher than the weights associated with direct testing at extreme values of p, and it also minimizes the increase in the variance of $\hat y_0$ associated with extrapolation. It is clear from the results summarized in Table 2 that optimal regression is remarkably suited to the problem of estimating stimulus levels corresponding to very high and to very low probability of response.
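
The comparison in Table 2 can be approximated with a short calculation. The sketch below assumes a normal distribution, the weight $w = (dp/dy)^2/(pq)$, a symmetric optimal design at $y = \pm 1.575$ with the allocation of Equation (14), and the same total number of specimens in both plans. Under those assumptions the direct-testing variance is proportional to $1/w(y_0)$ and the optimal regression variance to $y_0^2/(c^2 w(c))$. The function names are illustrative, and the sketch is not guaranteed to match the tabulated entries digit for digit.

```python
import math


def normal_weight(y):
    """w = (dp/dy)^2 / (p q) for the standard normal distribution."""
    density = math.exp(-0.5 * y * y) / math.sqrt(2.0 * math.pi)
    p = 0.5 * (1.0 + math.erf(y / math.sqrt(2.0)))
    return density ** 2 / (p * (1.0 - p))


def variance_ratio(y0, c=1.575):
    """Direct-testing variance at y0 divided by the optimal two-point variance.

    Both variances carry the same factor 1/n_total, which cancels.
    """
    direct = 1.0 / normal_weight(y0)
    optimal = y0 ** 2 / (c ** 2 * normal_weight(c))
    return direct / optimal


ratios = {y0: variance_ratio(y0) for y0 in (-1.575, -2.0, -3.0, -4.0)}
for y0, ratio in ratios.items():
    print(y0, round(ratio, 2))
```

The ratio equals one when $y_0$ coincides with the lower design point and grows rapidly as $y_0$ moves into the tail, which is the qualitative behavior described in the discussion.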


2.2.2. Application to Ordnance Problems. The optimum values of p in Table 1 are too close to zero and one to have direct application in ordnance problems. The difficulty lies in selecting $s_1$ and $s_2$ such that we do not obtain all responses or all non-responses at either $s_1$ or $s_2$. If either situation occurs, we cannot establish the two $\hat y$ values required to specify the fitted distribution. Thus, to use the optimal regression results directly, we require very accurate initial estimates of $\alpha$ and $\beta$. This requirement is of course quite impractical. Thus, we must modify the optimal regression strategy to make sure that we can always establish the two required $\hat y$ values. The modified procedure is termed the two-point strategy.

Figure 2. Plot of w, p relationships for the normal, logistic, and extreme value-smallest distributions.

3.  THE  TWO-POINT  STRATEGY.  There  are  two  versions  of 
the  two-point  strategy,  one  for  small  samples,  say  fifty 
specimens  or  less,  and  one  for  large  samples,  say  one 
hundred  or  more  specimens. 

3.1. Small Sample Procedure. The small sample procedure is as follows: (1) conduct the beginning portion of the test program using the up-and-down strategy illustrated in Figure 3, (2) change over to testing at only two stimulus levels $s_1$ and $s_2$ as soon as two finite values of $\hat y$ are established by the up-and-down portion of the test program, and (3a) allocate the test specimens to $s_1$ and $s_2$ as the test progresses using Equation (14) to decide between testing at $s_1$ or $s_2$, or (3b) proceed as in (3a) except test at the two stimulus levels corresponding to the optimum values of p in Table 1. (These two levels may be updated as the test progresses. The iterative procedure may be quite worthwhile when $s_1$ and $s_2$ are closely spaced.(a))

The up-and-down portion of the two-point test program should generally be undertaken with the uniform spacing between successive stimulus levels chosen to be approximately equal to the standard deviation of the underlying response distribution. If the spacing is too narrow, the resulting values of $s_1$ and $s_2$ in the two-point testing portion of the program will generally be too close together to permit precise estimation of $y_0$. On the other hand, if the spacing is too wide, the up-and-down portion of the test program tends to be quite long, with the successive test outcomes alternating back and forth between response and nonresponse. Thus, a reasonably accurate estimate of the standard deviation of the assumed underlying distribution is mandatory, viz., there


(a)  Ideally  the  investigator  has  a  computer  program  which 
records  the  given  test  outcome  and  provides  the  stimulus 
level  for  the  next  test.  Otherwise,  the  computations  may 
take  place  at  convenient  intervals  as  the  test  program 
progresses. 
Code: X denotes Response, O denotes Nonresponse

[Chart of test outcomes by test number at stimulus levels 2.0, 1.7, 1.4, 1.1, and 0.8, in two panels: (a) Up-and-down Testing and (b) Two-point Testing.]

Data Summary:

s_i    n_i    r_i    p̂_i
2.0     1      1     1.000
1.7     3      3     1.000
1.4     9      6     0.667
1.1    20      2     0.100

Figure 3. The two-point test program consists of: (a) a beginning up-and-down series of tests to establish two finite $\hat y$ values ($\hat p$ values not equal to zero or one), followed by (b) tests conducted at two stimulus levels, $s_1$ and $s_2$, with specimens allocated to $s_1$ or $s_2$ as the overall test progresses such that text Equation (14) is satisfied.

NOTE: The up-and-down test strategy is as follows: The outcome of any given test determines the stimulus level used in the next test. For example, the second specimen responded (denoted X), thus the third specimen was tested at a lower stimulus level. On the other hand, the third specimen did not respond (denoted O) and therefore the fourth specimen was tested at the next higher stimulus level. Uniform spacing between adjacent stimulus levels is used for convenience, but is not mandatory.
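
The up-and-down rule described in the note can be sketched as a short routine. This is an illustrative sketch only: the `responds` callback, the scripted outcomes, and the helper names are hypothetical, and the outcome sequence is not the one plotted in Figure 3.

```python
def up_and_down(levels, start_index, responds, n_tests):
    """Run n_tests up-and-down trials over an ordered list of stimulus levels.

    `responds(level, test_number)` is a hypothetical callback that returns
    True when the specimen responds. Returns a list of (level, outcome) pairs.
    """
    history = []
    index = start_index
    for test_number in range(1, n_tests + 1):
        level = levels[index]
        outcome = responds(level, test_number)
        history.append((level, outcome))
        if outcome:
            index = max(index - 1, 0)                # response: step down
        else:
            index = min(index + 1, len(levels) - 1)  # nonresponse: step up
    return history


def levels_with_finite_y(history):
    """Levels at which both outcomes were observed, i.e. 0 < p_hat < 1."""
    seen = {}
    for level, outcome in history:
        seen.setdefault(level, set()).add(outcome)
    return sorted(level for level, outcomes in seen.items() if len(outcomes) == 2)


levels = [0.8, 1.1, 1.4, 1.7, 2.0]
scripted = [True, True, True, False, False, True]  # hypothetical outcomes
history = up_and_down(levels, 4, lambda level, k: scripted[k - 1], len(scripted))
visited = [level for level, _ in history]
finite_levels = levels_with_finite_y(history)
print(visited, finite_levels)
```

In this scripted run only one level has shown both a response and a nonresponse, so the up-and-down portion would continue until a second such level appears, at which point the program would change over to two-point testing.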
must be some preliminary testing or some prior experience to form a basis for selecting the spacing of the stimulus levels used in testing. Generally an estimate of the standard deviation $\sigma$ that is accurate within plus or minus fifty percent is adequate, but it is preferable that the spacing d fall in the range $\sigma < d < (3\sigma/2)$. The advantage of the iterative procedure (3b) increases as d is decreased below $\sigma$.

Many readers will probably opt for the simplified test
method and analysis. In this case we merely ignore the
tests conducted at stimulus levels other than $s_1$ and $s_2$
(refer to Figure 3) and estimate the fitted distribution by
drawing a straight line through the two points $[(y_1, s_1),
(y_2, s_2)]$. The variance of $\hat{y}_0$ is then estimated using
Equation (12) and reading w from Figure 2.


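If it does not seem advisable to ignore tests at stimulus
levels other than $s_1$ and $s_2$, the variance of $\hat{y}_0$ may be
estimated using the general expression

$$\sigma^2(\hat{y}_0) \approx \frac{1}{\sum n_i w_i} + \frac{(s_0 - \bar{s}_w)^2}{\sum n_i w_i (s_i - \bar{s}_w)^2} \qquad (16)$$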

The $w_i$ values in (16) may be approximated either by empirical
weights (i.e., based on the observed $y_i$ values), or fitted
weights (e.g., based on maximum likelihood analyses [2]).
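As a rough illustration of how the general expression might be evaluated, the Python sketch below assumes that (16) has the usual weighted least-squares form (the reciprocal of the total weight plus a term in the squared distance of $s_0$ from the weighted mean stimulus level) and that the empirical weight is w = z^2/(pq), with z the standard normal ordinate at y. Both forms, and all function names, are assumptions made for illustration; the evaluation point 0.78 is the stimulus level that appears in the numerical example of Section 3.1.1. With only the two Figure 3 levels 1.1 and 1.4, the general form should agree with the two-point form written out directly.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def probit_weight(p):
    """Empirical weight w = z**2 / (p * q), with z the normal ordinate at y(p)."""
    y = STD_NORMAL.inv_cdf(p)
    z = STD_NORMAL.pdf(y)
    return z * z / (p * (1.0 - p))


def variance_general(s0, levels, counts, weights):
    """Assumed form of (16): 1/sum(n*w) + (s0 - s_bar)**2 / sum(n*w*(s - s_bar)**2)."""
    nw = [n * w for n, w in zip(counts, weights)]
    total = sum(nw)
    s_bar = sum(a * s for a, s in zip(nw, levels)) / total  # weighted mean level
    spread = sum(a * (s - s_bar) ** 2 for a, s in zip(nw, levels))
    return 1.0 / total + (s0 - s_bar) ** 2 / spread


def variance_two_point(s0, s1, s2, n1, n2, w1, w2):
    """Two-point special case, written out directly."""
    return ((s2 - s0) ** 2 / (n1 * w1) + (s1 - s0) ** 2 / (n2 * w2)) / (s2 - s1) ** 2


# Figure 3 data at the two retained stimulus levels.
levels, counts, observed_p = [1.1, 1.4], [20, 9], [0.100, 0.667]
weights = [probit_weight(p) for p in observed_p]
v_general = variance_general(0.78, levels, counts, weights)
v_two_point = variance_two_point(0.78, 1.1, 1.4, 20, 9, weights[0], weights[1])
print(f"weights: {weights[0]:.2f}, {weights[1]:.2f}")
print(f"general: {v_general:.4f}  two-point: {v_two_point:.4f}")
```

Levels with observed p of zero or one have no finite y and receive zero empirical weight, so including them here would not change the result; fitted weights would be needed for them to contribute.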


3.1.1. Numerical Example (Simplified Analysis). Given
the quantal response data in Figure 3 (ignoring the tests
at stimulus levels other than 1.4 and 1.1), viz.,

   s_i     n_i     r_i     p_i
   1.4      9       6     0.667
   1.1     20       2     0.100

estimate $s_0$ corresponding to $p_0 = 0.001$ and sketch the lower
95% (asymptotic) confidence band. Assume an underlying
normal distribution.
Solution. First, we shall check the allocation of $n_1$
and $n_2$ relative to the final values of p. For $p_1 = 0.100$,
$y_1$ from normal tables equals -1.28; and for $p_2 = 0.667$,
$y_2$ equals +0.43. Moreover, for $p_0 = 0.001$, $y_0 = -3.09$.
The corresponding values of w are 0.34 and 0.60 respectively.
Thus, using (14)

$$\frac{n_1}{n_2} = \sqrt{\frac{0.60}{0.34}} \left[ \frac{+0.43 - (-3.09)}{-1.28 - (-3.09)} \right] = 2.6$$

whereas the actual value is 20/9 = 2.2. This discrepancy
means that if further tests were conducted, the first few
additional tests should be conducted at $s_1 = 1.1$ unless
of course the p values change markedly as the data accumulate.


The fitted response distribution passes through the
points [(1.1, -1.28), (1.4, +0.43)], giving the response
expression

y = -7.55 + 5.70s

Hence, $y_0 = -3.09$ ($p_0 = 0.001$) corresponds to $s_0$ equal to
0.78. In turn, using (12)


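$$\sigma^2(\hat{y}_0) \approx \frac{1}{(1.4 - 1.1)^2} \left[ \frac{(1.4 - 0.78)^2}{20 \times 0.34} + \frac{(1.1 - 0.78)^2}{9 \times 0.60} \right]$$

Thus $\sigma^2(\hat{y}_0) \approx 0.84$.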


The  corresponding  lower  95%  asymptotic  confidence  band 
appears  in  Figure  4.  Note  that  we  can  be  approximately 
95%  confident  that  99.9%  of  all  specimens  will  survive  a 
stimulus  level  of  0.22. 
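The arithmetic of this example can be retraced with a short Python sketch. It is only an illustration under stated assumptions: the weight is taken as w = z^2/(pq), the allocation ratio and the two-point variance are written in the forms implied by the numbers quoted for Equations (14) and (12), and the lower confidence limit on $s_0$ is read as the stimulus level at which the one-sided upper 95% limit on y equals $y_0$. All variable and function names are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def probit_weight(p):
    """w = z**2 / (p * q), z being the standard normal ordinate at y(p)."""
    z = STD_NORMAL.pdf(STD_NORMAL.inv_cdf(p))
    return z * z / (p * (1.0 - p))


# Data from Figure 3 (levels 1.1 and 1.4 only) and the target proportion.
s1, n1, p1 = 1.1, 20, 0.100
s2, n2, p2 = 1.4, 9, 0.667
p0 = 0.001

y1, y2, y0 = (STD_NORMAL.inv_cdf(p) for p in (p1, p2, p0))
w1, w2 = probit_weight(p1), probit_weight(p2)

# Allocation check: ratio n1/n2 suggested by Equation (14) versus the actual 20/9.
ratio_suggested = (w2 / w1) ** 0.5 * (y2 - y0) / (y1 - y0)
ratio_actual = n1 / n2

# Straight line through (s1, y1) and (s2, y2), then the level s0 where y = y0.
slope = (y2 - y1) / (s2 - s1)
intercept = y1 - slope * s1
s0 = (y0 - intercept) / slope


def var_y(s):
    """Two-point variance of the fitted y at stimulus level s (assumed form of (12))."""
    return ((s2 - s) ** 2 / (n1 * w1) + (s1 - s) ** 2 / (n2 * w2)) / (s2 - s1) ** 2


def upper_limit_y(s, confidence=0.95):
    """One-sided upper confidence limit on y at stimulus level s."""
    return intercept + slope * s + STD_NORMAL.inv_cdf(confidence) * var_y(s) ** 0.5


# Lower confidence limit on s0: the level whose upper limit on y equals y0.
# upper_limit_y increases with s on this bracket, so bisection is enough.
low, high = s0 - 2.0, s0
for _ in range(60):
    mid = 0.5 * (low + high)
    if upper_limit_y(mid) < y0:
        low = mid
    else:
        high = mid
s0_lower = 0.5 * (low + high)

print(f"suggested n1/n2 = {ratio_suggested:.2f}, actual = {ratio_actual:.2f}")
print(f"y = {intercept:.2f} + {slope:.2f} s, s0 = {s0:.2f}")
print(f"var = {var_y(s0):.2f}, lower 95% limit on s0 = {s0_lower:.2f}")
```

Small differences from the rounded figures in the text arise because the sketch carries unrounded normal deviates and weights; the lower limit it finds should land close to the 0.22 quoted above.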


Figure 4. Plot of fitted response curve and the associated
lower 95% confidence band for the test example. (Axis label:
Stimulus Level.)

3.2. Large Sample Procedure. The large sample procedure
is based on information obtained by response tests
conducted using the previous small sample procedure. Namely,
approximately fifty specimens are tested using the small
sample procedure to estimate $s_1^*$ and $s_2^*$ corresponding to the
optimum p values in Table 1. Then, given this information,
the remaining specimens are tested at $s_1^*$ or at $s_2^*$ using (14)
for appropriate allocation; or else each successive specimen
may be tested at that stimulus level which minimizes (16)
as the data accumulate. The latter iterative procedure is
enhanced by a digital computer program compiled and placed
in a file ready for execution by remote terminal.
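The iterative alternative can be sketched in a few lines of Python. This is a simplified illustration, not the program the author refers to: it uses the two-point variance form rather than the general expression (16), it holds the weights and the estimate of $s_0$ fixed (in practice they would be updated as the p values change), and it borrows the rounded values from the numerical example of Section 3.1.1. Function names are hypothetical.

```python
def two_point_variance(s0, s1, s2, n1, n2, w1, w2):
    """Variance of the fitted y at s0 for tests concentrated at levels s1 and s2."""
    return ((s2 - s0) ** 2 / (n1 * w1) + (s1 - s0) ** 2 / (n2 * w2)) / (s2 - s1) ** 2


def next_level(s0, s1, s2, n1, n2, w1, w2):
    """Return the level (s1 or s2) at which one more specimen lowers the variance most."""
    v_if_s1 = two_point_variance(s0, s1, s2, n1 + 1, n2, w1, w2)
    v_if_s2 = two_point_variance(s0, s1, s2, n1, n2 + 1, w1, w2)
    return s1 if v_if_s1 <= v_if_s2 else s2


# Rounded values from the numerical example (assumed fixed for this sketch).
s0, s1, s2, w1, w2 = 0.78, 1.1, 1.4, 0.34, 0.60
n1, n2 = 20, 9
first_choice = next_level(s0, s1, s2, n1, n2, w1, w2)

# Allocate 200 further specimens one at a time.
for _ in range(200):
    if next_level(s0, s1, s2, n1, n2, w1, w2) == s1:
        n1 += 1
    else:
        n2 += 1

print(f"first additional test at s = {first_choice}")
print(f"after 200 more tests: n1 = {n1}, n2 = {n2}, n1/n2 = {n1 / n2:.2f}")
```

Under these fixed-weight assumptions the one-step rule sends the first additional specimen to the lower level and drives the ratio n1/n2 toward the value that (14) prescribes, so the two allocation rules described in the text lead to much the same test plan.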

4. SUMMARY. The procedure is straightforward: (a)
select the appropriate values of the stimulus level, and
(b) allocate the tests at these stimulus levels such that
the variance of the desired point estimate is minimized.
Usually the variance of the desired point estimate may be
reduced markedly merely by considering a few alternative
stimulus levels before testing (using Figure 2 and Equation
16). But the variance of the point estimate may be reduced
even further by adopting certain minimum variance strategies.
TECHNIQUES FOR STATISTICALLY DETERMINING FLIGHT
SUITABILITY  OF  AN  ARTILLERY  PROJECTILE 


Ronald  Corn 
Gertrude  Weintraub 
Ammunition  Development  and 
Engineering  Directorate 
Picatinny  Arsenal 
Dover,  New  Jersey 


ABSTRACT. The M483 155mm Projectile, while being tested at
Nicolet, Canada, to evaluate aeroballistic performance at
high air density, exhibited flight instability. The authors
were responsible for determining the cause of the problem,
correcting the problem, and developing the statistical technique
necessary for predicting success. The projectile design
modifications that evolved successfully passed retesting at
Nicolet, and the projectile has been released for production.
The induced yaw technique for disturbing projectiles as they
exit the gun tube, developed during this program, is currently
being used on other developmental projectiles and will be used
to evaluate aerodynamic stability of all future Howitzer type
projectiles.

The statistical techniques used to predict success, which
also permitted a minimal expenditure of projectiles, were:

a. A Weibull mathematical model was selected and implemented
to predict point estimates and confidence level estimates
of reliability and percentage points based upon the maximum
likelihood estimates of the parameters of a Weibull population.
This model afforded excellent theoretical descriptive
characteristics of the density and probability distributions of
the empirical test data which were symmetrical and asymmetrical
in form.

b. Automated computer programs especially adapted to
the Weibull model were employed to derive density and
probability distribution curves.

c. Probability plotting methods were implemented to
describe the adequacy of fit of the theoretical distributions
to the empirical test data.

1. INTRODUCTION. The M483 projectile development, which
was completed in 1971, provided an important new 155mm
capability to the US Army. Figure 1 depicts an M483 projectile
alongside the standard 155mm M107 projectile. Because of
the obvious increase in size and cargo volume, over 50% of
the standard, the M483 configuration is being utilized for a
variety of projectiles whose mission is to deliver cargo onto
a target area (e.g., chemical, smoke, illuminating and
submunition).

To accommodate the increased cargo, the M483 projectile is
over 6 calibers in length and utilizes an aluminum ogive and
base and fiberglass wrapped body to minimize weight and
distribute it properly for aerodynamic considerations. Because of
its unique shape, comparatively little knowledge of its
aerodynamic characteristics was available prior to 1974, when
surprisingly poor performance was exhibited in cold weather tests.

In  1974  a  cold  weather  test  program  was  conducted  at 
Nicolet,  Canada,  located  between  Montreal  and  Quebec  along 
the  Saint  Lawrence  River  (Figure  2).  Nicolet  provides  an 
existing  Canadian  test  facility  which  permits  projectile 
firings  at  near  Arctic  conditions  to  evaluate  aeroballistic 
performance at high air density (in excess of 110% of standard),
which  tends  to  amplify  aerodynamic  instabilities. 

On 14 Feb 74, 20 M483 projectiles were fired with
a standard US propellant charge whose weight was adjusted to
obtain a velocity of Mach 0.93. At these Arctic conditions
this Mach number was predicted to be the most severe
aerodynamically. The impact points of the 13 projectiles
which exhibited normal flight performance are shown in Figure
3. These projectiles impacted at expected ranges of
approximately 6300 meters. Seven of the twenty projectiles impacted
between 2000 and 3300 meters short of the impact area as shown
in Figure 4.

Production of the M483 was suspended as a result of the
incident at Nicolet and an intensive program was initiated to
determine the cause of the erratic performance. Initially a
fault tree was configured (Figure 5) and an investigative
program was developed based upon fault tree elements.

To determine whether the cause of the problem was rooted
in interior or exterior ballistics, it was necessary to
conduct a highly instrumented series of firings which, for the
first time, would obtain initial yaw characteristics of a
statistical sample of in-flight projectiles, as well as
projectile range information for those same projectiles. Figure
6 shows the test site at Yuma Proving Ground. Cameras and
yaw cards were used to independently measure launch angles
of the projectile while radar and standard triangulation
techniques were used to determine flight characteristics and
down range impact points. Launch velocities were adjusted
from standard US velocities to duplicate the critical Mach
number of the Nicolet tests by modification of propelling
charges.

The results of the initial tests showed that the M483
problem was primarily an exterior ballistic problem and that,
in fact, the aeroballistic characteristics of the projectile
were unsatisfactory. Ms. Weintraub's application of
statistical techniques proved invaluable for predicting performance
and follows in detail.

2.  STATISTICAL  TECHNIQUES  USED  IN  A  FLIGHT  SUITABILITY 
INVESTIGATION.  At  the  outset,  I  want  to  take  this  opportunity 
to  express  my  gratitude  to  Mr.  Corn  and  his  associates  in  this 
stability  investigation.  They  were  open-minded  and  willing  to 
draw  upon  statistical  disciplines  to  assist  them  in  resolving 
an  engineering  problem.  The  result  of  the  cohesive  union  of 
engineering  and  statistics  proved  successful. 

A complex problem was solved when a probabilistic
approach was applied to analyze real world test data. Professor
John Tukey of Princeton would probably refer to the statistician's
efforts in our data analysis as exploratory and probabilistic
and the end result as confirmatory. Our greatest gains in
analyzing empirical data came from surprises, which I will
explain a little later.

In  this  case,  the  engineering  community  succeeded  in 
ferreting  out  the  causes  for  short  rounds  (defined  as  those 
which  fail  to  fly  to  full  range)  and  redesigned  the  projectile 
to eliminate the occurrence of short rounds.

As  statistician,  I  entered  the  picture  after  the  following 
events  had  occurred: 

1. On 14 Feb 74, seven out of twenty standard M483
projectiles fired at critical Mach number (0.93) from the 109A1
Howitzer flew approximately half range.

2. The engineering community undertook an investigation
by designing a test program to determine the cause of these
short rounds. The program was a complex and ambitious one and
sought to determine whether the problem was either interior or
exterior ballistic related.
3. Aerodynamic knowledge at the start of the investigation
supported the belief that the M483 was stable up to a first
maximum yaw angle of 8°.

The first test conducted at Yuma Proving Ground was with
the standard M483 fired at critical Mach number in order to
correlate first maximum yaw angle with range. The yaw angles
were obtained with yaw cards, with cameras as backup; the test
setup is shown in Figure 6. The yaw cards were set
approximately 100 feet forward of the gun.

Figure  7  is  a  plot  of  the  first  maximum  yaw  angle  vs. 
range  and  the  first  surprise  of  this  test  program  was  that  the 
critical  yaw  angle  was  5-6°  and  not  8°  as  previously  predicted. 
Critical  yaw  angle  is  defined  as  the  angle  above  which  the 
projectile  becomes  aerodynamically  unstable  and  does  not  fly 
full  range. 

The yaw angles generated from 20 tests conducted with the
standard M483 projectile (varying its internal cargo, tubes and
muzzle brakes) were presented for analysis. As had been done
on other problems, a probabilistic design approach was used.

Yaw angle was considered the continuous random variable and the
problem was to examine the distribution of yaw angle. I chose
to fit a Weibull distribution model since it afforded me a
useful mathematical tool for describing the probability distribution
function and the density function of symmetrical and asymmetrical
forms. Figure 8 shows a spectrum of distributional forms which
can be described by a Weibull model (see Figure 9 for the
mathematical forms of the probability distribution function and
density of the Weibull distribution).

In terms of a statistical probability distribution, the
distribution of yaw angles for the standard M483 Projectile
fired from a 50% worn tube at Yuma is seen in Figure 10. It
was determined that this tube condition produced the highest
first maximum yaw distribution, and this tube was used for most
of the testing.

Maximum likelihood estimates of the parameters of a
Weibull population were determined based upon the iteration
procedures for joint maximum likelihood estimation of the 3
parameters of the Weibull population described by Harter and
Moore in their note in Technometrics, Volume 7, No. 4,
November 1965. The asymptotic variances and covariances
of maximum likelihood estimators were then employed in deriving
confidence interval estimates for probabilities based upon the
maximum likelihood estimates. The latter confidence interval
estimates were derived with the assistance of Dr. Einbinder and
members of the Computer Programming Facility at Picatinny Arsenal.
Based upon the maximum likelihood estimates of the 3
Weibull parameters, one could expect 33% of the standard M483
Projectiles fired from the 50% worn tube to exceed 5°. And, in
fact, at Nicolet, Canada, 7 out of 20 (35%) fell short. This
gave further credence to the low critical maximum yaw angle
premise.

The fitted yaw distribution function also indicated that
for the standard M483 to fly full range, its critical first
maximum yaw angle must be greater than 13°. At this critical
yaw angle one can expect no more than one short range projectile
in a million rounds.
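The two exceedance figures quoted above (33% beyond 5° and one in a million beyond 13°) can be connected through the Weibull survival function. The Python sketch below is purely illustrative: the authors' three maximum likelihood estimates are not reported in the text, so the sketch assumes a zero location parameter and the common form F(x) = 1 - exp[-((x - c)/a)^b], and back-solves a scale and shape that reproduce the two quoted figures. The resulting parameter values and all function names are hypothetical.

```python
import math


def weibull_sf(x, scale, shape, loc=0.0):
    """P(X > x) for a Weibull distribution; loc is the location (threshold) parameter."""
    if x <= loc:
        return 1.0
    return math.exp(-(((x - loc) / scale) ** shape))


def weibull_isf(q, scale, shape, loc=0.0):
    """Value exceeded with probability q (inverse of the survival function)."""
    return loc + scale * (-math.log(q)) ** (1.0 / shape)


def fit_two_points(x_a, q_a, x_b, q_b):
    """Scale and shape (loc = 0 assumed) matching two exceedance probabilities."""
    h_a, h_b = -math.log(q_a), -math.log(q_b)
    shape = math.log(h_b / h_a) / math.log(x_b / x_a)
    scale = x_a / h_a ** (1.0 / shape)
    return scale, shape


# Illustrative back-solve from the two figures quoted in the text.
scale, shape = fit_two_points(5.0, 0.33, 13.0, 1e-6)
print(f"scale = {scale:.2f} deg, shape = {shape:.2f}")
print(f"P(yaw > 5 deg)  = {weibull_sf(5.0, scale, shape):.2f}")
print(f"one-in-a-million yaw angle = {weibull_isf(1e-6, scale, shape):.1f} deg")
```

The same two functions give the design target directly: once a distribution has been fitted, the inverse survival function at 1e-6 is the critical yaw angle a projectile must tolerate for short rounds to be that rare.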

Thereafter,  the  investigative  test  program  was  directed 
to  assessing  the  effects  of  system  parameter  changes  on  the 
yaw  angle  distribution  and  the  design  of  modifications  that 
would  have  high  critical  yaw  angles.  The  system  parameters 
investigated  included:  new  tubes  and  worn  tubes,  with  and 
without  muzzle  brakes,  and  cargo  variation.  It  appeared  that 
the greatest effect on yaw angle level was the presence or
absence of a muzzle brake on the end of the gun tube.

Figure  11  shows  how  absence  of  a  muzzle  brake  improves 
the  yaw  angle  probability  distribution  of  the  standard  M483. 

Now only 7 in 10,000 rounds are expected to exceed the 5°
critical yaw angle, instead of 33% with a muzzle brake. This
frequency was also too high to be acceptable.

The real problem facing the engineering task team was to
design a projectile modification whose critical angle exceeded
13°, since as previously shown no more than one short range
round in a million would be expected at this critical yaw angle.

After many design modifications, and statistical analyses
of these changes, two modifications of the standard M483 were
built and tested. Figure 12 describes the modifications made
to the standard M483; Figure 13 compares the yaw angle
probability distribution functions obtained for Mods 1 and 2 when
tested with the 50% worn tube with muzzle brake. For each Mod,
it was found that one in a million rounds would exceed 8° first
maximum yaw angle.

Since the modifications were designed to be more stable
than the standard M483, a technique had to be devised for
determining how much more stable they were and also their critical
yaw angle.

Since it had been determined that muzzle brakes
significantly affected yaw angles, modified muzzle brakes (Figure 13A)
were designed and tested as a means of inducing even greater
yaw angles to evaluate design modifications. First maximum
yaw angles of as high as 20° were obtained.

Figure  14  illustrates,  visually,  by  means  of  a  yaw  card 
comparison,  the  large  angle  from  which  the  modified  rounds 
will  still  damp  and  fly  normal  ranges  as  compared  to  the 
original  M483  projectile,  Figure  15. 

An  interim  Picatinny  Report  dated  March  1975  has  been 
published  covering  this  work.  Figures  16  and  17  show  the 
adequacy  of  the  Weibull  model  in  describing  the  empirical 
distribution  characteristics  of  test  data  for  the  standard 
M483  round  and  for  design  modification  2. 

This probability plotting method was used to assess the
goodness of fit of the theoretical Weibull model to the
empirical test data.
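A probability plot of this kind can be outlined in Python as follows. The sketch is a generic two-parameter Weibull plot (location parameter assumed zero), not a reconstruction of Figures 16 and 17: ordered observations are paired with median-rank plotting positions (Benard's approximation, an assumption here), and ln x is plotted against ln[-ln(1 - F)], so that Weibull data fall near a straight line whose slope is the shape parameter. The sample is hypothetical, built from exact Weibull quantiles so that the expected straight line is known in advance.

```python
import math


def median_rank(i, n):
    """Benard's approximation to the median rank of the i-th of n ordered values."""
    return (i - 0.3) / (n + 0.4)


def weibull_plot_points(sample):
    """Return (ln x, ln(-ln(1 - F))) pairs for the ordered sample."""
    xs = sorted(sample)
    n = len(xs)
    return [
        (math.log(x), math.log(-math.log(1.0 - median_rank(i, n))))
        for i, x in enumerate(xs, start=1)
    ]


def least_squares(points):
    """Slope, intercept and correlation coefficient of a straight-line fit."""
    n = len(points)
    mean_u = sum(u for u, _ in points) / n
    mean_v = sum(v for _, v in points) / n
    suu = sum((u - mean_u) ** 2 for u, _ in points)
    svv = sum((v - mean_v) ** 2 for _, v in points)
    suv = sum((u - mean_u) * (v - mean_v) for u, v in points)
    slope = suv / suu
    return slope, mean_v - slope * mean_u, suv / math.sqrt(suu * svv)


# Hypothetical yaw angles: exact Weibull quantiles (scale 4.0, shape 2.5).
true_scale, true_shape, n = 4.0, 2.5, 20
sample = [
    true_scale * (-math.log(1.0 - median_rank(i, n))) ** (1.0 / true_shape)
    for i in range(1, n + 1)
]

slope, intercept, r = least_squares(weibull_plot_points(sample))
scale_from_plot = math.exp(-intercept / slope)
print(f"shape from slope = {slope:.3f}, scale from intercept = {scale_from_plot:.3f}")
print(f"correlation of plotted points = {r:.4f}")
```

With real test data the points scatter about the line, and systematic curvature (for example from an unmodelled location parameter) is the visual signal that the fitted model is inadequate.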

Figures 18 and 19 show the density function for the
standard M483 and design modification 2. Each of the
distributions is right-skewed, but we can see that modification 2
shows a significantly smaller dispersion around the mean.

Summing  up,  therefore,  what  modification  2  accomplished 
is  two-fold: 

1. It yielded a significantly smaller dispersion of
first maximum yaw angle around the mean: one in a million
exceeds 8° vs. 33% exceeding 5° for the standard M483.

2. It produced a more stable projectile: critical angle
greater than 18° vs. 5-6° for the standard M483.

3. CONCLUSION. A real world engineering problem was
resolved with the assistance of probability methods.
Statistical analyses were helped immeasurably by computer software
programs which were available. These programs afforded rapid
assessment of design modifications and comparisons. The
efforts could not possibly have been accomplished in as short
a time without the computer. The computer program of Drs.
Harter and Moore of Wright-Patterson Air Force Base was used
extensively to derive the maximum likelihood estimates of the
Weibull parameters.* Software programs available at Picatinny
Arsenal, specifically in the Concepts and Effectiveness
Division, contributed greatly toward the successful evaluation
of test data.

* As an aside, gratitude is extended to Dr. Badrig
Kurkjian for introducing Picatinny Arsenal to the
Harter-Moore program, which has proved to be invaluable
in helping to solve many engineering problems.

4.  STATISTICAL  CONTRIBUTION. 

1.  Statistical  probability  techniques  fixed  the 
critical  yaw  angle  for  the  standard  M483  Projectile. 

2. Statistical analysis predicted the yaw angle
probability distribution for many modifications and for
different tubes. These distributions provided the engineering
task team with essential information for directing their
efforts toward projectile modification.

3. a. For the first time, probability design was
used to predict projectile performance using a minimal number
of rounds. Reductions in cost and risk for future
artillery development programs should follow.

b. The application of probability design served
a twofold purpose:

(1) It predicted the probability of exceeding
a given yaw for a specific design M483 Projectile.

(2) It afforded the engineering task team
a goal, in this case a 13° critical yaw angle, so that their
efforts were directed toward achieving this goal in order to
eliminate short rounds.

4. A Blue Ribbon Panel especially assigned to
overview the stability investigation approved the efforts and
findings of the investigative team and commended all members
of the team for their analysis of and correction to the
projectile flight problem. The panel further stated that "in
the course of this program much has been learned that is of
basic value in the ballistic design and development of
projectiles." Further, the panel recommended that the "team can well
undertake future new and interesting designs of special shells"
and recommended that this project be well documented for future
guidance.

CONCLUDING  REMARKS: 

As  a  result  of  the  program  and  techniques  just  described,  modifications 
1  and  2  were  extensively  tested  at  Nicolet  during  the  winter  of  1975.  Both 
modifications  performed  satisfactorily  as  predicted.  Modification  2  was 
selected  since  it  did  not  result  in  internal  cargo  volume  loss  and  it  was 
recently  released  for  production  as  the  M483A1. 
FIGURE 3
M483-UK ZONE 3 (NORMAL ROUNDS)
PRELIMINARY

M483-UK ZONE 3 (SHORT ROUNDS)
PRELIMINARY, 14 FEB 74
QE = 30°
MEAN RANGE (OBSERVED) = 5387 M; PROBABLE ERROR = 875 M = 16.3% R
MEAN MUZZLE VELOCITY = 293.8 M/SEC; PROBABLE ERROR = 0.6 M/SEC

FIGURE 5
FAULT TREE FOR M483 PROJECTILE

[Plot residue: axis label RANGE, METERS]

FIGURE 8

THE WEIBULL DISTRIBUTION FUNCTION IS: [formula not recoverable]

WEIBULL M483 STD W/O MUZZLE BRAKE
[Axis label: 1st MAX. YAW ANGLE]

FIGURE 12

FIGURE 13
WEIBULL FOR MOD I AND MOD II

FIGURE 14
First Maximum Yaw Angle M483 MOD 2 Flies Full
Range when Disturbed up to This Angle

FIGURE 15
5.5° First Maximum Yaw Angle - Standard M483
Projectile Falls Short at This Angle
APPLICATION OF LIFE TESTING TECHNIQUES TO DETECTION DATA

Carl  B.  Bates  and  Jerry  Thomas 
Applications  Group 

Methodology  and  Resources  Directorate 
US  Army  Concepts  Analysis  Agency 
Bethesda,  Maryland 

ABSTRACT. Life testing techniques for censored sample data are
discussed. Single and progressive censoring of type I and type II are
defined. The detection phenomenon involving observers not always detecting
targets is placed in the framework of progressively censored sampling.

Maximum likelihood estimates for the parameters of the two-parameter
Weibull distribution are given, and a test statistic is presented for
comparing two Weibull distributions fitted to censored sample data.

Weibull  distributions  of  sample  sizes  500,  250,  and  100  having  0,  10,  and 
20  percent  censored  are  simulated.  The  shape  parameter  is  varied  over  the 
range  1.0  to  3.5  and  equality  of  pairs  of  the  distributions  is  tested. 

The  relationships  between  Beta  and  the  Beta  difference  that  is  distinguishable 
are  given  for  each  of  the  three  sample  sizes.  For  the  largest  sample 
size,  at  the  0. 5-level  of  significance,  the  Beta  difference  that  is 
distinguishable  varied  from  0.15  for  small  shape  parameter  values  to 
0.38 for large shape parameter values. For the 100 sample size
distributions, the Beta difference distinguishable varied from 0.30 to 0.73.
I. INTRODUCTION

The  detection,  identification,  and  localization  of  enemy  targets 
is  an  integral  part  of  many  US  Army  studies.  These  studies  may  be 
classified  into  either  computer  simulated  experimentation  or  field 
conducted experimentation. Field experimentation involving the
detection process is usually performed to estimate or compare the
effectiveness of materiel or methods of employment. Often empirical data from
the  field  experimentation  is  then  used  as  input  to  computer  simulation 
models,  or  the  analysis  results  of  the  empirical  data  are  used  to 
provide  the  basis  of  simulating  detection  in  computer  simulation  models. 

Because  of  the  "no  detections"  (observers  not  detecting  exposed 
targets)  which  occur  in  field  experimentation  involving  detection 
processes,  the  analysis  of  empirical  detection  data  presents  unique 
problems.  In  the  sections  which  follow,  the  analysis  problems  are 
discussed  and  a  proposed  analysis  methodology  is  presented  and  illustrated. 

II.  PROBLEM  DESCRIPTION  AND  BACKGROUND 

A.  Problem  Description 

A  field  experiment  involving  candidate  land  combat  systems  is 
designed  and  conducted.  One  of  the  many  measures  of  effectiveness  of 
the  systems  is  detection  time.  During  the  conduct  of  the  experiment, 
however, the systems do not always detect exposed enemy targets.
Therefore, detection time data is not collected for all of the planned trials
of the field experiment. Consequently, the original orthogonal design
for the experiment is nonorthogonal with respect to the response variable,
detection time. The objective of this report is to present a method of
analysis which uses both the detection times of detected targets and the
exposure times of undetected targets.

B.  Background 

Land combat experimentation involving the detection of targets
invariably results in targets not being detected for some of the
experimental trials, e.g., Caviness et al. (1972) and McKinney et al. (1971)
and (1972). Treating the "no detect" trials as missing values and
applying one of the statistical techniques for estimating missing values does
not have appeal because it does not utilize all the available information
from the experimental data, namely, the duration of the time that
line-of-sight existed between the observer and the target. Ignoring the no detect
trials and analyzing only the data from trials for which a detection did
occur does not have appeal for the same reason. Moreover, analyses based
on all available experimental data address the unconditional detection
probability of interest, whereas analyses based on only trials for which
a detection did occur address the conditional probability of detection,
given a detection has occurred.

A  search  for  a  proper  method  of  analysis  of  the  detection  times 
which  would  utilize  the  target  exposure  times  of  the  no  detect  trials  led 
to the area of life testing. It was concluded that the detection
phenomenon when not all targets are detected is similar to the censored
sample situation in life testing.
III. LIFE TESTING


In life testing a number (N) of components are tested and the time
to a component's failure is recorded. If components are withdrawn from
the test before failure (in our case a target passes from an exposed
state to a concealed state without being detected) the sample is termed
censored. Censoring may be of two types:

1. Type I - in which at some predetermined fixed time, say $t_0$,
testing is terminated, or

2. Type II - in which after some predetermined fixed number,
say n, of sample items fail, testing is terminated.

With each type of censoring, the collected data consists of the n failure
times $t_1, t_2, \ldots, t_n$, plus the information that the remaining (N-n) items
survived beyond the time of termination, $t_0$ for Type I and $t_n$ for Type II.

The censoring described above yields singly censored samples. If,
however, the initial censoring results in withdrawal of only a portion
of the surviving items, with some remaining under test until ultimate
failure or until a subsequent stage of censoring is performed, we have
progressively (multiply) censored samples. In general, then, censoring
occurs progressively in k stages at times $T_i$, i = 1, 2, ..., k, and at
the ith stage of censoring $r_i$ sample items are selected randomly from
the survivors at time $T_i$ and removed (that is, censored) from further
observation. This is analogous to our detection phenomenon. We have a
target coming from a concealed state to an exposed state just as a test
item comes under observation during a test. If, however, a target
passes from an exposed state back to a concealed state without being
detected, it is removed from further observation at a time $T_i$ (equal to
the target's total exposure time). Further, in our case each of the k
values $r_i$ equals one because in general the exposure times of any two or
more undetected targets are not identical.

Past experience has shown a positive skewness in the empirical data
distributions of time variables associated with the target detection
process; see Bates (1971) and McKinney et al. (1971) and (1972). Moreover,
in McKinney et al. (1972) it was shown that the two-parameter Weibull
distribution gave adequate approximations to detection time sample
distributions. In the probability density function (pdf) of the
two-parameter Weibull distribution,

$$f(x) = (\beta/\alpha^{\beta})\, x^{\beta-1} \exp[-(x/\alpha)^{\beta}]; \quad x > 0,\ \alpha > 0,\ \beta > 0, \qquad (1)$$

$\alpha$ is the scale parameter and $\beta$ is the shape parameter.

The Weibull distribution provides considerable flexibility for approximating a variety of distributions. When $\beta = 1$ we have the exponential distribution, and when $\beta = 3.5$ we have a distribution very close to the normal distribution. In FIGURE 1 on the next page, the Weibull pdf is shown for three different shape parameters. The middle curve is a positively skewed distribution similar to that of our target detection times.
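The special cases mentioned above can be checked numerically. The following is a minimal sketch, not part of the original study; the function names are illustrative. It evaluates equation (1) and the corresponding cdf, and computes the skewness of the distribution from its raw moments, which are proportional to $\Gamma(1 + k/\beta)$.

```python
import math

def weibull_pdf(x, alpha, beta):
    """Equation (1): alpha is the scale parameter, beta the shape parameter."""
    if x <= 0:
        return 0.0
    return (beta / alpha**beta) * x**(beta - 1) * math.exp(-((x / alpha) ** beta))

def weibull_cdf(x, alpha, beta):
    """Probability that a detection time is at most x."""
    if x <= 0:
        return 0.0
    return 1.0 - math.exp(-((x / alpha) ** beta))

def weibull_skewness(beta):
    """Skewness from the raw moments E[(x/alpha)^k] = Gamma(1 + k/beta)."""
    m1, m2, m3 = (math.gamma(1 + k / beta) for k in (1, 2, 3))
    variance = m2 - m1**2
    third_central = m3 - 3 * m1 * m2 + 2 * m1**3
    return third_central / variance**1.5

if __name__ == "__main__":
    # Shape values spanning the range studied in the paper.
    for beta in (1.0, 2.0, 3.5):
        print(beta, round(weibull_skewness(beta), 3))
```

With $\beta = 1$ the pdf reduces to the exponential density $\exp(-x/\alpha)/\alpha$, whose skewness is 2; for $\beta$ near 3.5 the skewness is close to zero, which is the sense in which the distribution resembles the normal.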
FIGURE 1, Weibull Probability Density Function


The  flexibility  of  the  Weibull  distribution  can  be  further 
illustrated  in  terms  of  the  cumulative  distribution  function  (cdf). 

In the context of our detection problem, the cdf $F(x_i)$ is the probability of detection by time $x_i$. FIGURE 2a is an S-shaped cdf similar to that of
a  normal  distribution.  FIGURE  2b  illustrates  the  cdf  of  a  Weibull 
distribution  having  the  same  shape  parameter  as  the  distribution  in 
FIGURE  2a,  but  a  larger  scale  parameter.  FIGURE  2c  has  the  same  scale 
parameter  as  FIGURE  2a,  but  a  smaller  shape  parameter. 
IV. ESTIMATION


The first step in the analysis process is the approximation of the distribution of target detection times. This involves estimating the two parameters, $\alpha$ and $\beta$, of equation (1). Substituting $\hat{\alpha}$ and $\hat{\beta}$ for $\alpha$ and $\beta$ in f(x) gives the approximation distribution, $\hat{f}(x)$, of target detection times. The estimation technique which is employed evolved from life testing.

Cohen (1963) shows that although intermediate steps in the derivations differ, the maximum likelihood estimation equations for Type I and Type II progressively censored samples yield the same end result.

The maximum likelihood estimation equations for the two-parameter Weibull distribution are given in Cohen (1965). The equations are nonlinear in the parameters and must, therefore, be solved by iterative procedures.

He solves the expression

$$ \left[ \frac{\sum^{*} x_i^{\hat{\beta}} \ln x_i}{\sum^{*} x_i^{\hat{\beta}}} - \frac{1}{\hat{\beta}} \right] = \frac{1}{n} \sum_{i=1}^{n} \ln x_i \qquad (2) $$

for $\hat{\beta}$. The asterisk denotes that the summation is over the entire sample, with the $r_i$ observations censored at time $T_i$ assigned the value $x_i = T_i$; the sum on the right-hand side is over the n observed (uncensored) times. Then, substituting $\hat{\beta}$ obtained from equation (2) into the other maximum likelihood estimation equation, $\partial \ln L / \partial \alpha = 0$, and solving for $\hat{\alpha}$, he gets

$$ \hat{\alpha} = \left( \sum^{*} x_i^{\hat{\beta}} / n \right)^{1/\hat{\beta}}, \qquad (3) $$

where ln L is the logarithm of the likelihood function. Substitution of the two obtained parameter estimates, $\hat{\alpha}$ and $\hat{\beta}$, into equation (1) yields the desired approximation distribution $\hat{f}(x)$.
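The two-step solution of equations (2) and (3) can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually programmed for the study: the iterative method here is plain bisection (the paper does not say which iteration was used), and the function name and its arguments are hypothetical.

```python
import math
import random

def weibull_mle_censored(failures, censored=()):
    """Return (alpha_hat, beta_hat) from equations (2) and (3).

    failures: observed (uncensored) times.
    censored: withdrawal times T_i of the censored items.
    """
    n = len(failures)
    everything = list(failures) + list(censored)
    scale = max(everything)
    # Equation (2) is unchanged when every time is divided by a constant,
    # so rescale to (0, 1] to keep z**beta from overflowing.
    z_all = [x / scale for x in everything]
    mean_log = sum(math.log(x / scale) for x in failures) / n

    def g(beta):
        # Left side minus right side of equation (2); increasing in beta.
        s0 = sum(z**beta for z in z_all)
        s1 = sum(z**beta * math.log(z) for z in z_all)
        return s1 / s0 - 1.0 / beta - mean_log

    lo, hi = 1e-3, 1.0
    while g(hi) < 0:          # expand until the root is bracketed
        hi *= 2.0
    for _ in range(100):      # bisection on equation (2)
        mid = 0.5 * (lo + hi)
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    beta_hat = 0.5 * (lo + hi)
    # Equation (3), undoing the rescaling.
    alpha_hat = scale * (sum(z**beta_hat for z in z_all) / n) ** (1.0 / beta_hat)
    return alpha_hat, beta_hat

if __name__ == "__main__":
    rng = random.Random(1)
    times = [rng.weibullvariate(25.0, 2.0) for _ in range(500)]
    # Illustrative single censoring: observation stops at time 35.
    fails = [t for t in times if t <= 35.0]
    cens = [35.0 for t in times if t > 35.0]
    print(weibull_mle_censored(fails, cens))
```

Bisection is used only because the left side of equation (2) minus its right side is monotone in $\hat{\beta}$, so a bracketed root is unique; a Newton iteration would converge faster.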
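The mean of $\hat{f}(x)$ is

$$ E(x) = \hat{\alpha}\, \Gamma(1 + 1/\hat{\beta}), \qquad (4) $$

and the approximate variance is

$$ V(x) = (\partial f/\partial \alpha)^2 V(\hat{\alpha}) + (\partial f/\partial \beta)^2 V(\hat{\beta}) + 2 (\partial f/\partial \alpha)(\partial f/\partial \beta)\, \mathrm{Cov}(\hat{\alpha}, \hat{\beta}). \qquad (5) $$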
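Equation (4) can be checked against the tabulated summary statistics, and equation (5) can be evaluated numerically. The sketch below is illustrative only. It assumes that f in equation (5) stands for the mean of equation (4) regarded as a function of the two parameters, and it uses central differences for the partial derivatives; the variances and covariance passed in the demonstration are hypothetical values, not results from the study.

```python
import math

def weibull_mean(alpha, beta):
    """Equation (4): mean of the fitted Weibull distribution."""
    return alpha * math.gamma(1.0 + 1.0 / beta)

def delta_method_variance(alpha, beta, var_alpha, var_beta, cov_ab, h=1e-5):
    """Equation (5), assuming f denotes the mean of equation (4)."""
    d_alpha = (weibull_mean(alpha + h, beta) - weibull_mean(alpha - h, beta)) / (2 * h)
    d_beta = (weibull_mean(alpha, beta + h) - weibull_mean(alpha, beta - h)) / (2 * h)
    return (d_alpha**2 * var_alpha
            + d_beta**2 * var_beta
            + 2 * d_alpha * d_beta * cov_ab)

if __name__ == "__main__":
    # Estimates from sample 1 of TABLE A-1; the tabulated E(x) is 24.251.
    print(round(weibull_mean(24.750, 1.053), 3))
    # Hypothetical variances and covariance, for illustration only.
    print(delta_method_variance(24.750, 1.053, 1.2, 0.0015, 0.01))
```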

V.  HYPOTHESIS  TESTING 

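Suppose that in a field experiment two candidate detection devices are under study. One of the primary objectives of the experiment is to compare the detection distributions of the two devices and make inferences concerning the equality of the two populations. After the estimation techniques of the previous section have been applied to the empirical detection data collected on the performance of the two devices, giving an approximation distribution for each device, we are interested in comparing these two distributions. Specifically, the null hypothesis

$$ H_0: (\alpha_1, \beta_1) = (\alpha_2, \beta_2) \qquad (6) $$

is tested against the two-sided alternative hypothesis

$$ H_1: (\alpha_1, \beta_1) \neq (\alpha_2, \beta_2). \qquad (7) $$

The test statistic for testing the null hypothesis against the alternative hypothesis is Q, where

$$ Q = (\hat{\alpha}_1 - \hat{\alpha}_2,\ \hat{\beta}_1 - \hat{\beta}_2)\, \Sigma^{-1}\, (\hat{\alpha}_1 - \hat{\alpha}_2,\ \hat{\beta}_1 - \hat{\beta}_2)^{T}, \qquad (8) $$

and where the variance-covariance matrix is

$$ \Sigma = \begin{pmatrix} \sigma^2(\hat{\alpha}) & \sigma(\hat{\alpha}, \hat{\beta}) \\ \sigma(\hat{\alpha}, \hat{\beta}) & \sigma^2(\hat{\beta}) \end{pmatrix} = \begin{pmatrix} V(\hat{\alpha}_1) + V(\hat{\alpha}_2) & \mathrm{Cov}(\hat{\alpha}_1, \hat{\beta}_1) + \mathrm{Cov}(\hat{\alpha}_2, \hat{\beta}_2) \\ \mathrm{Cov}(\hat{\alpha}_1, \hat{\beta}_1) + \mathrm{Cov}(\hat{\alpha}_2, \hat{\beta}_2) & V(\hat{\beta}_1) + V(\hat{\beta}_2) \end{pmatrix}. \qquad (9) $$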

Equation (8) is a quadratic form and is approximately distributed as a Chi-square variate with two degrees of freedom; see, for example, Mood (1950), Rao (1952), or Wilks (1962). That is,

$$ Q \sim \chi^2(2). \qquad (10) $$

An inspection of equation (8) shows that close agreement between the two distributions yields a small statistic, while a large difference between the two yields a large statistic. Therefore, the critical region of the test is the upper tail of the $\chi^2$-distribution. Consequently, to test the null hypothesis of equation (6), compare Q with $\chi^2(1-\alpha, 2)$, where $\alpha$ here denotes the significance level rather than the scale parameter. If $Q \geq \chi^2(1-\alpha, 2)$, reject the null hypothesis at the $\alpha$-level of significance; otherwise do not reject the null hypothesis. By rejecting the null hypothesis, we are saying that the two detection distributions are not equal.
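The decision rule can be sketched in a few lines. The sketch assumes that Q is the usual quadratic form in the differences of the two pairs of parameter estimates, with the summed variances and covariances as its matrix; the function names and the numbers in the demonstration are hypothetical. For two degrees of freedom the upper-tail probability of the Chi-square distribution is exp(-x/2), so the critical value has the closed form -2 ln(level).

```python
import math

def q_statistic(est1, est2, cov1, cov2):
    """Quadratic form in the parameter differences.

    est = (alpha_hat, beta_hat)
    cov = (V(alpha_hat), Cov(alpha_hat, beta_hat), V(beta_hat))
    """
    d_a = est1[0] - est2[0]
    d_b = est1[1] - est2[1]
    s_aa = cov1[0] + cov2[0]
    s_ab = cov1[1] + cov2[1]
    s_bb = cov1[2] + cov2[2]
    det = s_aa * s_bb - s_ab**2
    # d' S^{-1} d written out for a symmetric 2 x 2 matrix S.
    return (s_bb * d_a**2 - 2.0 * s_ab * d_a * d_b + s_aa * d_b**2) / det

def chi2_2df_critical(level):
    """Upper-tail critical value of the Chi-square distribution with 2 df."""
    return -2.0 * math.log(level)

def reject_equality(q, level=0.05):
    return q >= chi2_2df_critical(level)

if __name__ == "__main__":
    # Hypothetical estimates and (co)variances, for illustration only.
    q = q_statistic((26.0, 1.2), (25.0, 1.0), (0.5, 0.0, 0.005), (0.5, 0.0, 0.005))
    print(q, chi2_2df_critical(0.05), reject_equality(q))
```

The closed form reproduces the two critical values quoted later in the paper, 5.991 at the 0.05 level and 9.210 at the 0.01 level.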
VI. TEST DISCRIMINATION


A.  General 

In the previous section it was seen that the determination of a difference between distributions is dependent upon the scale parameter, $\alpha$, and the shape parameter, $\beta$. For this study it was decided to set $\alpha$ equal to 25 and concentrate our efforts on the shape parameter, $\beta$. When $\beta = 1$, the Weibull distribution is equivalent to the exponential distribution, and when $\beta = 3.5$, the distribution is approximately normal. Since the shape of the detection distribution is expected to be within this range, shape parameter values between 1.0 and 3.5 are studied.

B. Sample Size of 500

Test performance in application can be no better than the asymptotic power of the test. Because no information is available on the power of the test, an initial sensitivity analysis is performed. Consequently, large samples having a moderate amount of censoring are first studied.

Weibull distributions of sample size 500 having three different percentages of censoring (0, 10, and 20) were generated by Monte Carlo simulation. The scale parameter was arbitrarily fixed at $\alpha = 25$. The range of the shape parameter values (1.0 to 3.5) was divided into five sub-ranges of length 0.5 each. Within each sub-range $\beta$ was incremented in steps of 0.1 to give six $\beta$-values, i.e., (1.0, 1.1, 1.2, 1.3, 1.4, 1.5), (1.5, 1.6, 1.7, 1.8, 1.9, 2.0), ..., (3.0, 3.1, 3.2, 3.3, 3.4, 3.5). For each of the six $\beta$-values, a Weibull distribution was generated for each of the three percentages censored. This gave eighteen distributions for each of the five $\beta$-value sub-ranges, or 153 pair-wise comparisons within each sub-range. For completeness and anticipated follow-on analyses, summary statistics are tabulated in APPENDIX A. TABLES A-1 through A-5 contain the five sets of summary statistics of the eighteen distributions.
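One way such samples could be generated is sketched below. The paper does not describe its censoring mechanism, so the scheme here is an assumption made for illustration: detection times are drawn by inverting the Weibull cdf, and observation is cut off at the time beyond which the stated fraction of the distribution lies, so that the expected proportion censored equals the target percentage. The function name and arguments are hypothetical.

```python
import math
import random

def simulate_censored_weibull(n, alpha, beta, frac_censored, seed=0):
    """Draw n Weibull(alpha, beta) times and censor the upper tail.

    Returns (failures, censored), where every censored item carries the
    cutoff time. Assumed scheme, not the one used in the original study.
    """
    rng = random.Random(seed)
    if frac_censored > 0:
        # Time t0 with P(X > t0) = frac_censored.
        cutoff = alpha * (-math.log(frac_censored)) ** (1.0 / beta)
    else:
        cutoff = math.inf
    failures, censored = [], []
    for _ in range(n):
        u = 1.0 - rng.random()                        # uniform on (0, 1]
        x = alpha * (-math.log(u)) ** (1.0 / beta)    # inverse-cdf draw
        if x > cutoff:
            censored.append(cutoff)
        else:
            failures.append(x)
    return failures, censored

if __name__ == "__main__":
    # One of the eighteen configurations of a sub-range: beta = 1.2, 10 percent.
    fails, cens = simulate_censored_weibull(500, 25.0, 1.2, 0.10, seed=42)
    print(len(fails), len(cens))
```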

Within each set of eighteen distributions, all possible (153) comparisons were made between pairs of distributions. That is, the null hypothesis of equality of the two distributions, equation (6), was tested. This gave 153 Q-statistics. The corresponding $\beta$ differences ($\hat{\beta}_j - \hat{\beta}_i$; i = 1, 2, ..., 17; j = 2, 3, ..., 18) were calculated and paired with the Q-statistics. Within each set of $\beta$ differences and Q-statistics, six different combinations existed between the percentages censored in the two distributions being compared: (0,0), (0,10), (0,20), (10,10), (10,20), and (20,20). The distribution of the 153 cases over the six combinations is shown in TABLE 1 below.


TABLE 1
CENSORING DISTRIBUTION

Combination   Percentage Censored     Number of
Number        (Sample j, Sample i)    Samples
     1        (0,0)                      15
     2        (0,10) or (10,0)           36
     3        (0,20) or (20,0)           36
     4        (10,10)                    15
     5        (10,20) or (20,10)         36
     6        (20,20)                    15
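The counts in TABLE 1 follow from simple enumeration: eighteen distributions give 18 x 17 / 2 = 153 unordered pairs, of which each same-censoring combination contributes 6 x 5 / 2 = 15 pairs and each mixed combination 6 x 6 = 36. A short sketch that enumerates them (variable names are illustrative):

```python
from collections import Counter
from itertools import combinations

censoring_levels = [0, 10, 20]   # percent censored
n_beta_values = 6                # six shape values per sub-range

# Eighteen distributions: every (shape index, percent censored) pair.
samples = [(b, pct) for b in range(n_beta_values) for pct in censoring_levels]

# All unordered pairs of distinct distributions.
pairs = list(combinations(samples, 2))

# Classify each pair by its (unordered) pair of censoring percentages.
counts = Counter(tuple(sorted((s1[1], s2[1]))) for s1, s2 in pairs)

if __name__ == "__main__":
    print(len(pairs))
    for combo in sorted(counts):
        print(combo, counts[combo])
```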
The theoretical relationship between the $\beta$ differences and Q is parabolic. Therefore, a quadratic in $\beta$ differences was fitted for each of the six combinations in TABLE 1, using $\hat{\beta}$ differences as the independent variable and the Q-statistic as the dependent variable. Within each of the five $\beta$-value sub-ranges, the quadratic fit for each of the six censoring combinations was evaluated for Q = 5.991, the critical value for the 0.05-level of significance. This gives the difference between the shape parameters of two distributions which would be declared significant at the 0.05-level of significance. The largest variation among each set of the six $\hat{\beta}$ differences was 0.04. This is well within the variability of the generated data. The six combinations of each $\beta$-value sub-range were then "pooled" and a quadratic fit was made to each of the five sub-ranges of the 153 $\hat{\beta}$ differences. All fits were "good"; the coefficients of determination ranged from 0.90 to 0.97. Each of the five sub-range quadratic regression equations was then evaluated for two levels of significance (0.05 and 0.01), that is, for Q = 5.991 and Q = 9.210. The resulting relationship between $\beta$ and the $\beta$ differences detectable for the two significance levels is graphically illustrated in FIGURE 3 on the following page.

FIGURE 3 suggests a strong linear relationship between $\beta$ and the $\beta$ difference that is detectable. In fact, the ratio of the plotted $\beta$ differences over their respective sub-range mid-points is nearly constant for each level of significance. At the 0.05-level of significance, the ratio is approximately 0.12; at the 0.01-level of significance, it is approximately 0.15.
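The fit-and-invert step can be sketched as follows: fit Q as a quadratic in the shape-parameter difference by least squares, then solve the fitted quadratic for the difference at which Q reaches the critical value. The sketch is an illustration only. The data in the demonstration are synthetic (an exact parabola with a made-up curvature of 150), not values from the study, and the function names are hypothetical.

```python
import math

def fit_quadratic(xs, ys):
    """Least-squares fit of y = c0 + c1*x + c2*x**2 via the normal equations."""
    s = [sum(x**k for x in xs) for k in range(5)]
    t = [sum(y * x**k for x, y in zip(xs, ys)) for k in range(3)]
    a = [[s[i + j] for j in range(3)] + [t[i]] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting on the 3 x 4 system.
    for col in range(3):
        piv = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        for r in range(3):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [v - factor * w for v, w in zip(a[r], a[col])]
    return [a[i][3] / a[i][i] for i in range(3)]

def detectable_difference(coeffs, q_crit):
    """Positive root of c0 + c1*d + c2*d**2 = q_crit (assumes c2 > 0)."""
    c0, c1, c2 = coeffs
    disc = c1**2 - 4.0 * c2 * (c0 - q_crit)
    return (-c1 + math.sqrt(disc)) / (2.0 * c2)

if __name__ == "__main__":
    # Synthetic (difference, Q) pairs lying on Q = 150 * d**2.
    diffs = [0.05 * k for k in range(11)]
    qs = [150.0 * d**2 for d in diffs]
    coeffs = fit_quadratic(diffs, qs)
    print(detectable_difference(coeffs, 5.991))
    print(detectable_difference(coeffs, 9.210))
```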
[FIGURE 3; axis label: BETA DIFFERENCE]


FIGURES A-1 through A-5 of APPENDIX A pictorially illustrate typical distributions, within each of the five sub-ranges, which are statistically different when equality is tested at the 0.05-level of significance. Each of the five figures contains a plot of two distributions, taken from the samples shown in TABLES A-1 through A-5, respectively. The distribution having the smaller shape parameter is drawn with a solid line and its shape parameter estimate is denoted by $\hat{\beta}_1$; the distribution having the larger shape parameter is shown with a dashed line and its shape parameter estimate is denoted by $\hat{\beta}_2$. For example, in FIGURE A-1, samples were selected from a distribution with $\beta = 1.1$ (with no censoring) and with $\beta = 1.2$ (with 10% censoring); and the two sample estimates of the shape parameter are $\hat{\beta}_1 = 1.062$ and $\hat{\beta}_2 = 1.227$. The Q-statistic for testing the null hypothesis of equation (6) is also given on each figure. In each case, the Q-statistic is between $\chi^2(0.95, 2) = 5.991$ and $\chi^2(0.99, 2) = 9.210$. That is, the level of significance at which the null hypothesis would be rejected is between 0.05 and 0.01. The five figures illustrate the test discrimination between distributions of different shapes over a range of shape parameter values from 1.0 to 3.5.

C.  Sample  Size  of  250 

In practice large samples are often not available. Therefore, test performance for two smaller sample sizes (N = 250 and N = 100) is studied. The results for N = 250 are presented first.

Weibull distributions of sample size 250 were generated. The same scale and shape parameters and the same percentages of censoring were used as for N = 500. The procedure described in Section B above was repeated using N = 250. The summary statistics are given in TABLES B-1 through B-5 of APPENDIX B. This time the largest variation among each set of the six $\hat{\beta}$ differences was 0.07. Again, this variation is within the variability of the generated data. The Beta differences obtained from the evaluations of the five quadratic regression equations are given in TABLE 2 below. As before, there appears to be a linear relationship between $\beta$ and the $\beta$ difference that is detectable.
TABLE 2
BETA DIFFERENCES FOR N = 250

Significance                      Beta
Level          1.0-1.5   1.5-2.0   2.0-2.5   2.5-3.0   3.0-3.5
0.05            0.20      0.29      0.37      0.45      0.54
0.01            0.24      0.36      0.47      0.55      0.67

The test discrimination for each of the five sub-ranges is illustrated in FIGURES B-1 through B-5 in APPENDIX B. The notation in the figures is the same as that described in the previous section. The distribution having the smaller shape parameter estimate is denoted by $\hat{\beta}_1$ and the larger is denoted by $\hat{\beta}_2$. The significance level of each pair of illustrated distributions is between 0.05 and 0.01. The Q-statistic is again given on each of the five figures.

D.  Sample  Size  of  100 

In the examination of the test performance for N = 100, the sub-ranges of the shape parameter values had to be reconstructed. This was because the Beta difference which is distinguishable is larger than 0.5 for shape parameters greater than 1.5. Therefore, the shape parameter range was divided into three sub-ranges rather than the five previously used. The three sub-ranges were 1.0-1.5, 1.5-2.5, and 2.5-3.5. Within the first sub-range, $\beta$ was incremented in steps of 0.1 as before. But within the two larger sub-ranges, $\beta$ was incremented in steps of 0.2. This gave six $\beta$-values for each of the three sub-ranges. The summary statistics of the three sets of eighteen distributions are given in TABLES C-1, C-2, and C-3.
The largest variation among each set of the six $\hat{\beta}$ differences was 0.08, again within the variability of the data. The Beta differences from the three quadratic regressions are given in TABLE 3. Test discrimination is pictorially illustrated in the three figures of Appendix C.

TABLE 3
BETA DIFFERENCES FOR N = 100

Significance              Beta
Level          1.0-1.5   1.5-2.5   2.5-3.5
0.05            0.30      0.48      0.73
0.01            0.37      0.57      0.89


The test discrimination for all three sample sizes is shown in FIGURE 4. All three sample sizes exhibit a linear relationship between $\beta$ and the $\beta$ difference that is detectable. As expected, the $\beta$ difference that is detectable is smaller for large sample sizes than for small sample sizes. The dependence of the detectable $\beta$ difference upon $\beta$ is greater for small sample sizes than it is for large sample sizes; the trend lines for N = 100 have the steepest slope.
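The near-linear relationship can be checked from the tabulated values by dividing each detectable difference in TABLES 2 and 3 by the mid-point of its sub-range, the same ratio that was reported for N = 500. The sketch below does only this arithmetic; the ratios it prints for N = 250 and N = 100 are derived here from the tables and are not stated in the text.

```python
# Detectable beta differences copied from TABLE 2 (N = 250) and TABLE 3 (N = 100).
table2 = {
    0.05: [0.20, 0.29, 0.37, 0.45, 0.54],
    0.01: [0.24, 0.36, 0.47, 0.55, 0.67],
}
midpoints_250 = [1.25, 1.75, 2.25, 2.75, 3.25]

table3 = {
    0.05: [0.30, 0.48, 0.73],
    0.01: [0.37, 0.57, 0.89],
}
midpoints_100 = [1.25, 2.0, 3.0]

def ratios(differences, midpoints):
    """Detectable difference divided by the sub-range mid-point."""
    return [d / m for d, m in zip(differences, midpoints)]

if __name__ == "__main__":
    for level in (0.05, 0.01):
        print(250, level, [round(r, 3) for r in ratios(table2[level], midpoints_250)])
        print(100, level, [round(r, 3) for r in ratios(table3[level], midpoints_100)])
```

A roughly constant ratio within each row is what a straight line through the origin would give, and a larger ratio for the smaller sample size corresponds to the steeper trend lines for N = 100.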

VII.  CONCLUSIONS 

The  test  statistic  performed  satisfactorily  over  the  range  of  shape 
parameters  and  the  percentages  of  censoring  investigated.  For  the  three 
sample  sizes  and  the  parameter  values  studied,  test  discrimination  is  not 
degraded  when  censoring  does  not  exceed  twenty  percent  of  the  sample  size. 
FIGURE 4, TEST DISCRIMINATION (axis label: BETA DIFFERENCE; axis ticks 1.0 to 3.5)
Therefore, under moderate degrees of censoring, the Q-statistic provides a useful test statistic for testing the equality of two fitted Weibull distributions. The relationships shown in FIGURE 4 between $\beta$ and the $\beta$ differences that are distinguishable can serve as indicators of test discrimination. These indicators should be of value when designing target detection experimentation and when analyzing target detection data in which not all exposed targets are detected.
APPENDIX A
SAMPLE  SIZE  OF  500 
TABLE A-1, N = 500 and Shape Parameter Equal 1.0 - 1.5

Sample   Percent
Number   Censored   $\alpha$   $\hat{\alpha}$   $\beta$   $\hat{\beta}$   E(x)     V(x)
     1          0   25         24.750           1.0       1.053           24.251   530.968
     2         10   25         27.050           1.0       1.028           26.747   677.018
     3         20   25         30.912           1.0       1.023           30.625   896.403
     4          0   25         25.219           1.1       1.062           24.629   538.161
     5         10   25         26.285           1.1       1.098           25.379   535.669
     6         20   25         27.943           1.1       1.219           26.181   465.978
     7          0   25         25.637           1.2       1.188           24.179   417.413
     8         10   25         26.702           1.2       1.227           24.979   418.800
     9         20   25         27.661           1.2       1.123           26.515   559.533
    10          0   25         25.975           1.3       1.279           24.071   359.644
    11         10   25         25.717           1.3       1.312           23.708   332.413
    12         20   25         27.895           1.3       1.346           25.592   369.171
    13          0   25         24.131           1.4       1.461           21.857   231.181
    14         10   25         25.234           1.4       1.518           22.747   233.317
    15         20   25         25.710           1.4       1.388           23.466   293.222
    16          0   25         25.669           1.5       1.502           23.169   246.870
    17         10   25         25.341           1.5       1.427           23.029   268.047
    18         20   25         28.123           1.5       1.473           25.445   308.610
TABLE A-2, N = 500 and Shape Parameter Equal 1.5 - 2.0

Sample   Percent
Number   Censored   $\alpha$   $\hat{\alpha}$   $\beta$   $\hat{\beta}$   E(x)     V(x)
     1          0   25         24.023           1.5       1.599           21.540   190.289
     2         10   25         26.163           1.5       1.546           23.537   241.584
     3         20   25         27.159           1.5       1.560           24.410   255.493
     4          0   25         25.947           1.6       1.585           23.284   225.728
     5         10   25         26.201           1.6       1.763           23.326   186.667
     6         20   25         24.382           1.6       1.590           21.874   198.292
     7          0   25         24.338           1.7       1.700           21.716   172.883
     8         10   25         24.927           1.7       1.775           22.183   166.842
     9         20   25         25.551           1.7       1.836           22.701   164.275
    10          0   25         24.856           1.8       1.839           22.083   154.996
    11         10   25         26.119           1.8       1.795           23.231   179.268
    12         20   25         28.095           1.8       1.934           24.917   180.238
    13          0   25         24.585           1.9       1.899           21.816   142.759
    14         10   25         25.649           1.9       1.769           22.830   177.822
    15         20   25         26.921           1.9       1.842           23.916   181.228
    16          0   25         24.507           2.0       1.945           21.732   135.675
    17         10   25         25.258           2.0       1.954           22.396   142.935
    18         20   25         26.617           2.0       2.019           23.585   149.435
TABLE A-3, N = 500 and Shape Parameter Equal 2.0 - 2.5

Sample   Percent
Number   Censored   $\alpha$   $\hat{\alpha}$   $\beta$   $\hat{\beta}$   E(x)     V(x)
     1          0   25         25.833           2.0       2.050           22.885   136.899
     2         10   25         25.150           2.0       2.039           22.282   131.058
     3         20   25         25.268           2.0       1.995           23.280   148.812
     4          0   25         25.608           2.1       2.069           22.683   132.273
     5         10   25         25.918           2.1       2.003           22.968   143.820
     6         20   25         25.970           2.1       2.096           23.002   132.933
     7          0   25         25.965           2.2       2.372           23.012   106.574
     8         10   25         25.956           2.2       2.253           22.990   116.586
     9         20   25         25.952           2.2       2.317           22.993   110.950
    10          0   25         25.530           2.3       2.281           22.615   110.349
    11         10   25         25.119           2.3       2.270           22.251   107.766
    12         20   25         26.443           2.3       2.387           23.439   109.267
    13          0   25         24.427           2.4       2.329           21.643    97.346
    14         10   25         25.088           2.4       2.399           22.240    97.532
    15         20   25         26.236           2.4       2.577           23.298    94.108
    16          0   25         24.550           2.5       2.614           21.809    80.384
    17         10   25         24.814           2.5       2.478           22.012    90.150
    18         20   25         26.159           2.5       2.585           23.231    93.100
TABLE A-4, N = 500 and Shape Parameter Equal 2.5 - 3.0

Sample   Percent
Number   Censored   $\alpha$   $\hat{\alpha}$   $\beta$   $\hat{\beta}$   E(x)     V(x)
     1          0   25         24.657           2.5       2.573           21.894    83.374
     2         10   25         25.261           2.5       2.688           22.461    81.143
     3         20   25         26.613           2.5       2.690           23.664    89.922
     4          0   25         24.947           2.6       2.637           22.168    81.790
     5         10   25         25.938           2.6       2.607           23.040    90.202
     6         20   25         25.906           2.6       2.786           23.064    80.241
     7          0   25         23.718           2.7       2.571           21.060    77.219
     8         10   25         25.821           2.7       2.669           22.953    85.816
     9         20   25         25.625           2.7       2.705           22.789    82.562
    10          0   25         25.040           2.8       2.750           22.282    76.633
    11         10   25         25.372           2.8       3.100           22.690    64.130
    12         20   25         25.394           2.8       2.975           22.668    68.911
    13          0   25         25.166           2.9       2.856           22.426    72.552
    14         10   25         26.543           2.9       2.988           23.698    74.736
    15         20   25         25.832           2.9       3.054           23.085    68.194
    16          0   25         25.034           3.0       2.804           22.292    74.090
    17         10   25         25.906           3.0       2.958           23.120    72.396
    18         20   25         26.220           3.0       2.836           23.359    79.728
TABLE A-5, N = 500 and Shape Parameter Equal 3.0 - 3.5

Sample   Percent
Number   Censored   $\alpha$   $\hat{\alpha}$   $\beta$   $\hat{\beta}$   E(x)     V(x)
     1          0   25         24.828           3.0       3.100           22.204    61.431
     2         10   25         25.701           3.0       3.144           23.000    64.230
     3         20   25         26.197           3.0       3.067           23.417    69.606
     4          0   25         23.978           3.1       3.007           21.414    60.328
     5         10   25         24.927           3.1       3.175           22.317    59.436
     6         20   25         25.758           3.1       3.136           23.048    64.801
     7          0   25         25.062           3.2       3.288           22.477    56.625
     8         10   25         25.815           3.2       3.176           23.113    63.718
     9         20   25         24.788           3.2       3.197           22.200    58.088
    10          0   25         25.489           3.3       3.251           22.847    59.696
    11         10   25         25.632           3.3       3.181           22.950    62.642
    12         20   25         25.384           3.3       3.410           22.808    54.600
    13          0   25         25.141           3.4       3.369           22.575    54.652
    14         10   25         25.163           3.4       3.673           22.699    47.311
    15         20   25         25.401           3.4       3.457           22.840    53.421
    16          0   25         24.879           3.5       3.437           22.363    51.742
    17         10   25         24.816           3.5       3.525           22.337    49.316
    18         20   25         25.355           3.5       3.674           22.873    48.005
FIGURE A-1, N = 500 and Beta Between 1.0 and 1.5

FIGURE A-2, N = 500 and Beta Between 1.5 and 2.0

FIGURE A-3, N = 500 and Beta Between 2.0 and 2.5

FIGURE A-4, N = 500 and Beta Between 2.5 and 3.0

FIGURE A-5, N = 500 and Beta Between 3.0 and 3.5
APPENDIX B


SAMPLE  SIZE  OF  250 


TABLE B-1, N = 250 and Shape Parameter Equal 1.0 - 1.5

Sample   Percent
Number   Censored   $\alpha$   $\hat{\alpha}$   $\beta$   $\hat{\beta}$   E(x)     V(x)
     1          0   25         23.762           1.0       1.020           23.573   534.632
     2         10   25         26.378           1.0       1.127           25.258   504.394
     3         20   25         29.632           1.0       1.086           28.718   700.942
     4          0   25         25.476           1.1       1.134           24.347   462.878
     5         10   25         29.179           1.1       1.170           27.631   560.984
     6         20   25         26.487           1.1       1.158           25.158   474.698
     7          0   25         24.300           1.2       1.286           22.494   310.885
     8         10   25         27.358           1.2       1.339           25.124   359.439
     9         20   25         28.148           1.2       1.256           26.186   440.023
    10          0   25         25.737           1.3       1.393           23.476   291.445
    11         10   25         26.655           1.3       1.390           24.322   314.016
    12         20   25         23.056           1.3       1.258           21.443   294.347
    13          0   25         27.990           1.4       1.413           25.474   334.136
    14         10   25         24.890           1.4       1.392           22.706   272.956
    15         20   25         27.801           1.4       1.384           25.384   344.597
    16          0   25         22.362           1.5       1.482           20.218   192.794
    17         10   25         26.176           1.5       1.490           23.651   261.168
    18         20   25         25.832           1.5       1.522           23.279   243.071
TABLE B-2, N = 250 and Shape Parameter Equal 1.5 - 2.0

Sample   Percent
Number   Censored   $\alpha$   $\hat{\alpha}$   $\beta$   $\hat{\beta}$   E(x)     V(x)
     1          0   25         24.720           1.5       1.693           22.062   179.775
     2         10   25         26.497           1.5       1.527           23.870   254.103
     3         20   25         26.613           1.5       1.514           23.998   260.989
     4          0   25         26.040           1.6       1.774           23.174   182.153
     5         10   25         26.950           1.6       1.776           23.982   194.670
     6         20   25         23.984           1.6       1.603           21.500   188.652
     7          0   25         23.940           1.7       1.634           21.424   180.782
     8         10   25         26.037           1.7       1.752           23.187   186.544
     9         20   25         27.766           1.7       1.658           24.819   236.237
    10          0   25         25.325           1.8       1.818           22.511   164.492
    11         10   25         25.188           1.8       1.742           22.440   176.679
    12         20   25         26.896           1.8       1.789           23.926   191.213
    13          0   25         26.207           1.9       2.020           23.222   144.716
    14         10   25         26.284           1.9       1.881           23.331   166.163
    15         20   25         26.581           1.9       1.836           23.617   177.900
    16          0   25         26.000           2.0       2.015           23.039   143.155
    17         10   25         25.068           2.0       1.876           22.254   151.880
    18         20   25         26.476           2.0       2.021           23.460   147.584
TABLE B-3, N = 250 and Shape Parameter Equal 2.0 - 2.5

Sample   Percent
Number   Censored   $\alpha$   $\hat{\alpha}$   $\beta$   $\hat{\beta}$   E(x)     V(x)
     1          0   25         24.476           2.0       2.104           21.678   117.223
     2         10   25         26.505           2.0       2.147           23.473   132.560
     3         20   25         27.163           2.0       2.020           24.068   155.524
     4          0   25         25.263           2.1       2.184           22.373   116.729
     5         10   25         26.646           2.1       2.071           23.602   142.957
     6         20   25         26.016           2.1       2.276           23.045   115.090
     7          0   25         22.595           2.2       2.079           20.014   102.081
     8         10   25         25.165           2.2       2.082           22.290   126.272
     9         20   25         [illegible]      2.2       2.252           24.116   128.451
    10          0   25         25.664           2.3       2.216           22.730   117.367
    11         10   25         25.207           2.3       2.556           22.378    88.159
    12         20   25         26.006           2.3       2.419           23.058   103.245
    13          0   25         24.950           2.4       2.292           22.103   104.538
    14         10   25         26.426           2.4       2.447           23.435   104.468
    15         20   25         25.957           2.4       2.374           23.006   106.331
    16          0   25         25.662           2.5       2.490           22.767    95.563
    17         10   25         26.282           2.5       2.416           23.302   105.697
    18         20   25         26.208           2.5       2.417           23.236   105.035
TABLE B-4, N = 250 and Shape Parameter Equal 2.5 - 3.0

Sample   Percent
Number   Censored   $\alpha$   $\hat{\alpha}$   $\beta$   $\hat{\beta}$   E(x)     V(x)
     1          0   25         24.262           2.5       2.535           21.534    82.838
     2         10   25         25.532           2.5       2.565           22.668    89.858
     3         20   25         26.152           2.5       2.437           23.190   103.054
     4          0   25         25.741           2.6       2.465           22.831    97.911
     5         10   25         24.870           2.6       2.569           22.082    85.021
     6         20   25         26.853           2.6       2.774           23.903    86.829
     7          0   25         25.724           2.7       2.653           22.862    86.062
     8         10   25         25.816           2.7       2.736           22.968    82.166
     9         20   25         26.075           2.7       2.904           23.252    75.714
    10          0   25         25.143           2.8       2.687           22.356    80.425
    11         10   25         26.269           2.8       2.748           23.375    84.475
    12         20   25         26.334           2.8       2.914           23.486    76.778
    13          0   25         24.546           2.9       2.704           21.830    75.828
    14         10   25         26.268           2.9       3.026           23.466    71.612
    15         20   25         27.227           2.9       2.913           24.283    82.091
    16          0   25         25.083           3.0       2.901           22.367    70.178
    17         10   25         25.996           3.0       3.100           23.248    67.300
    18         20   25         26.109           3.0       3.176           23.376    65.154
TABLE B-5, N = 250 and Shape Parameter Equal 3.0 - 3.5

Sample   Percent
Number   Censored   $\alpha$   $\hat{\alpha}$   $\beta$   $\hat{\beta}$   E(x)     V(x)
     1          0   25         24.158           3.0       3.049           21.588    59.806
     2         10   25         25.033           3.0       3.258           22.441    57.350
     3         20   25         25.607           3.0       3.189           22.931    62.237
     4          0   25         25.825           3.1       3.228           23.140    62.027
     5         10   25         24.663           3.1       3.143           22.070    59.200
     6         20   25         25.610           3.1       3.389           23.004    56.148
     7          0   25         25.665           3.2       3.032           22.929    68.138
     8         10   25         26.050           3.2       3.543           23.454    53.880
     9         20   25         25.577           3.2       3.443           22.993    54.520
    10          0   25         23.590           3.3       3.026           21.074    57.762
    11         10   25         25.560           3.3       3.237           22.906    60.457
    12         20   25         25.075           3.3       3.614           22.600    48.293
    13          0   25         24.413           3.4       3.395           21.931    50.869
    14         10   25         25.768           3.4       3.462           23.171    54.834
    15         20   25         24.639           3.4       3.685           22.231    45.098
    16          0   25         25.200           3.5       3.552           22.692    50.221
    17         10   25         25.484           3.5       3.589           22.960    50.455
    18         20   25         25.830           3.5       3.481           23.234    54.600

FIGURE B-1, N = 250 and Beta Between 1.0 and 1.5

FIGURE B-2, N = 250 and Beta Between 1.5 and 2.0

FIGURE B-3, N = 250 and Beta Between 2.0 and 2.5

FIGURE B-4, N = 250 and Beta Between 2.5 and 3.0

FIGURE B-5, N = 250 and Beta Between 3.0 and 3.5
APPENDIX C
SAMPLE  SIZE  OF  100 
TABLE C-1, N = 100 and Shape Parameter Equal 1.0 - 1.5

Sample   Percent
Number   Censored   $\alpha$   $\hat{\alpha}$   $\beta$   $\hat{\beta}$   E(x)     V(x)
     1          0   25         23.178           1.0       1.099           22.373   415.600
     2         10   25         28.484           1.0       0.997           28.516   817.443
     3         20   25         30.436           1.0       1.082           29.537   747.180
     4          0   25         23.569           1.1       1.030           23.286   511.188
     5         10   25         27.753           1.1       1.171           26.276   506.755
     6         20   25         26.699           1.1       1.137           25.498   505.449
     7          0   25         29.079           1.2       1.164           27.580   565.090
     8         10   25         24.477           1.2       1.348           22.451   283.365
     9         20   25         23.936           1.2       1.265           22.235   313.330
    10          0   25         21.880           1.3       1.306           20.189   243.187
    11         10   25         25.352           1.3       1.295           23.432   332.685
    12         20   25         30.068           1.3       1.391           27.433   399.079
    13          0   25         25.828           1.4       1.352           23.679   313.640
    14         10   25         25.178           1.4       1.468           22.791   249.206
    15         20   25         25.076           1.4       1.356           22.978   293.746
    16          0   25         24.874           1.5       1.642           22.251   193.331
    17         10   25         25.526           1.5       1.381           23.315   291.978
    18         20   25         27.881           1.5       1.481           25.209   299.958
TABLE C-2, N = 100 and Shape Parameter Equal 1.5 - 2.5

Sample   Percent
Number   Censored   $\alpha$   $\hat{\alpha}$   $\beta$   $\hat{\beta}$   E(x)     V(x)
     1          0   25         25.677           1.5       1.970           22.763   145.523
     2         10   25         25.802           1.5       1.470           23.353   261.083
     3         20   25         28.284           1.5       1.739           25.199   223.403
     4          0   25         24.982           1.7       1.761           22.242   170.148
     5         10   25         26.609           1.7       2.069           23.570   142.841
     6         20   25         30.676           1.7       1.856           27.243   231.938
     7          0   25         25.189           1.9       1.884           22.358   152.180
     8         10   25         26.307           1.9       2.115           23.299   134.187
     9         20   25         22.409           1.9       1.763           19.949   136.580
    10          0   25         23.490           2.1       1.890           20.848   131.500
    11         10   25         26.654           2.1       2.090           23.608   140.722
    12         20   25         23.775           2.1       2.130           21.056   108.133
    13          0   25         25.297           2.3       2.182           22.403   117.322
    14         10   25         24.436           2.3       2.307           21.649    99.114
    15         20   25         24.798           2.3       2.561           22.017    85.016
    16          0   25         24.991           2.5       2.298           22.140   104.409
    17         10   25         25.673           2.5       2.596           22.802    88.972
    18         20   25         27.186           2.5       2.636           24.157    97.178

TABLE C-3, N = 100 and Shape Parameter Equal 2.5 - 3.5

Sample   Percent
Number   Censored   alpha   alpha-hat   beta   beta-hat   E(x)     V(x)
   1        0        25      25.572     2.5     2.806    22.773    77.236
   2       10        25      27.798     2.5     2.673    24.712    99.212
   3       20        25      25.351     2.5     2.774    22.566    77.412
   4        0        25      25.412     2.7     3.021    22.699    67.207
   5       10        25      23.864     2.7     2.823    21.256    66.566
   6       20        25      25.472     2.7     2.836    22.693    75.244
   7        0        25      22.876     2.9     2.716    20.347    65.350
   8       10        25      24.891     2.9     3.496    22.394    50.329
   9       20        25      25.791     2.9     2.498    22.883    96.030
  10        0        25      24.755     3.1     2.938    22.086    66.896
  11       10        25      25.776     3.1     3.119    23.059    65.494
  12       20        25      26.556     3.1     3.371    23.847    60.940
  13        0        25      27.079     3.3     3.637    24.414    55.696
  14       10        25      26.633     3.3     3.356    23.910    61.756
  15       20        25      25.906     3.3     3.346    23.254    58.731
  16        0        25      25.802     3.5     3.421    23.188    56.100
  17       10        25      25.686     3.5     3.830    23.225    45.922
  18       20        25      25.569     3.5     3.434    22.983    54.747

FIGURE C-2, N = 100 and Beta Between 1.5 and 2.5

FIGURE C-3, N = 100 and Beta Between 2.5 and 3.5

ON THE ROBUSTNESS OF THE EXPONENTIAL DISTRIBUTION

George C. Canavos
Virginia Commonwealth University
Richmond, Virginia


ABSTRACT. This paper examines the robustness of the exponential
time-to-failure distribution when this probability law is
compared against some logical alternatives such as the Weibull
and gamma distributions relative to estimation procedures involving
the scale parameter.


1. INTRODUCTION. Since the pioneering work on life testing
and reliability estimation during the early 1950's (see, for
example, [1] and [2]), the exponential distribution has been the
most widely assumed probability law in describing times to failure
of many types of components and systems. There is little doubt
that this distribution has played a key role in both theory and
application over the past twenty or so years. Surely, therefore,
it is of continued interest to query, "What if the assumption
of the exponential probability law does not hold? To what extent
then will such an occurrence affect subsequent inferences and
estimation procedures derived as a result of and depending on
this assumption?"

A substantive study on the robustness of the exponential
distribution is hereby attempted. Where possible, the treatment
is analytic. Particular attention is given to the estimation of
the scale parameter and the ramifications regarding the
mean-squared error (MSE) of its estimate if the exponential assumption
does not hold. The effect on the MSE is determined as a function
of a situation in which the true sampling distribution of lifetimes
is not the assumed exponential but rather is either a
Weibull or a gamma. By following such a procedure, the degree of
robustness of the exponential distribution is measured and
quantified.


2. THEORETICAL DEVELOPMENT OF ROBUSTNESS. Let $X_1, X_2, \ldots, X_n$
denote the times-to-failure of n like items. Assume that these
lifetimes follow the exponential distribution with probability
density function (pdf)

$$f(x;\theta) = \frac{1}{\theta}\exp\left(-\frac{x}{\theta}\right), \quad x > 0 \tag{1}$$

where interest is on the estimation of the parameter $\theta$. By
appealing to the likelihood function

$$L(x_1, x_2, \ldots, x_n; \theta) = \theta^{-n}\exp\left(-\frac{1}{\theta}\sum_{i=1}^{n} x_i\right)$$

one can easily determine the minimum variance unbiased estimate
(MVUE) of $\theta$ to be

$$\hat\theta = \sum_{i=1}^{n} \frac{x_i}{n}. \tag{2}$$

Suppose, however, that in reality the lifetimes $X_1, X_2, \ldots, X_n$ are
realizations of a Weibull random variable with pdf

$$h(x;\theta,\alpha) = \frac{\alpha}{\theta}\, x^{\alpha-1} \exp\left(-\frac{x^{\alpha}}{\theta}\right), \quad x > 0 \tag{3}$$

where $\alpha$ is a shape parameter. Again it is a rather straightforward
procedure to determine that the MVUE of $\theta$ in this case (taking
$\alpha$ as known) is

$$\hat\theta = \sum_{i=1}^{n} \frac{x_i^{\alpha}}{n}. \tag{4}$$

Thus, if in reality the lifetimes follow the Weibull, the optimal
efficiency (in the classical sense) for estimating $\theta$ is provided
by the MSE of the MVUE (4), which reduces to

$$\mathrm{MSE}(\hat\theta)_{W} = \frac{\theta^{2}}{n}. \tag{5}$$
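
Equation (5) can be checked by simulation. The following is a minimal sketch, assuming the Weibull parametrization of (3), under which $X^{\alpha}$ is exponential with mean $\theta$; the function names, the seed and the number of replications are illustrative choices and not part of the paper.

```python
import random

def sample_weibull(theta, alpha, n, rng):
    """Draw n lifetimes with pdf (alpha/theta) x^(alpha-1) exp(-x^alpha/theta).

    Under this parametrization X**alpha is exponential with mean theta,
    so X = (theta * E)**(1/alpha) where E is a standard exponential draw.
    """
    return [(theta * rng.expovariate(1.0)) ** (1.0 / alpha) for _ in range(n)]

def weibull_mvue(xs, alpha):
    """Estimator (4): the average of x_i**alpha (alpha treated as known)."""
    return sum(x ** alpha for x in xs) / len(xs)

def empirical_mse(theta, alpha, n, reps=100_000, seed=1):
    """Monte Carlo estimate of the MSE of estimator (4)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        est = weibull_mvue(sample_weibull(theta, alpha, n, rng), alpha)
        total += (est - theta) ** 2
    return total / reps

if __name__ == "__main__":
    theta, alpha, n = 5.0, 2.0, 5
    # Compare the simulated MSE with the closed form theta^2 / n of (5).
    print(empirical_mse(theta, alpha, n), theta ** 2 / n)
```

The simulated value should settle near $\theta^2/n$ as the number of replications grows; the agreement is only as good as the Monte Carlo error allows.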

Since the exponential distribution was assumed to accurately
represent the lifetimes $X_1, X_2, \ldots, X_n$, however, the estimate of $\theta$
is determined by (2). Thus, what effect would the fact that the
lifetimes follow the Weibull as opposed to the exponential have
on the MSE of the estimator given by (2)? That is, if in reality
$X_1, X_2, \ldots, X_n$ follow the Weibull with pdf given by (3), then for
equation (2)

$$\mathrm{MSE}(\hat\theta) = \mathrm{var}(\hat\theta) + \{\theta - E(\hat\theta)\}^{2}$$

where

$$E(\hat\theta) = E\left(\sum_{i=1}^{n} \frac{x_i}{n}\right) = \theta^{1/\alpha}\,\Gamma\left(1 + \frac{1}{\alpha}\right)$$

and

$$\mathrm{var}(\hat\theta) = \mathrm{var}\left(\sum_{i=1}^{n} \frac{x_i}{n}\right) = \frac{1}{n^{2}}\sum_{i=1}^{n} \mathrm{var}(x_i) = \frac{\theta^{2/\alpha}}{n}\left\{\Gamma\left(1 + \frac{2}{\alpha}\right) - \Gamma^{2}\left(1 + \frac{1}{\alpha}\right)\right\}.$$

Hence after some algebraic manipulation, the MSE with respect to
the indicated perturbation is expressed by

$$\mathrm{MSE}(\hat\theta)_{E|W} = \frac{\theta^{2/\alpha}\left\{\Gamma\left(1 + \frac{2}{\alpha}\right) + (n-1)\,\Gamma^{2}\left(1 + \frac{1}{\alpha}\right)\right\} - 2n\,\theta^{1+1/\alpha}\,\Gamma\left(1 + \frac{1}{\alpha}\right) + n\theta^{2}}{n} \tag{6}$$

where the notation "E|W" indicates assumed exponential but in
reality Weibull sampling. A comparison between equations (5) and
(6) provides a measure of robustness relative to MSE in the
assumption of the exponential distribution when estimating the scale
parameter $\theta$. Numerical results are given in the next section.
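
The comparison of (6) with (5) is easy to compute directly. The sketch below is an illustration only: it evaluates the perturbed MSE as variance plus squared bias of the sample mean under Weibull sampling and divides by $\theta^2/n$. The function names are hypothetical, and the parametrization assumed is the one in which $X^{\alpha}$ is exponential with mean $\theta$.

```python
from math import gamma

def mse_weibull_mvue(theta, n):
    """Equation (5): MSE of the Weibull MVUE of theta."""
    return theta ** 2 / n

def mse_exp_given_weibull(theta, alpha, n):
    """Equation (6), computed as variance plus squared bias of the sample mean."""
    g1 = gamma(1.0 + 1.0 / alpha)
    g2 = gamma(1.0 + 2.0 / alpha)
    variance = theta ** (2.0 / alpha) * (g2 - g1 ** 2) / n
    bias = theta - theta ** (1.0 / alpha) * g1
    return variance + bias ** 2

def weibull_ratio(theta, alpha, n):
    """Ratio of (6) to (5); values near 1 indicate robustness."""
    return mse_exp_given_weibull(theta, alpha, n) / mse_weibull_mvue(theta, n)

if __name__ == "__main__":
    for alpha in (0.8, 0.9, 1.1, 1.2, 1.5, 2.0, 2.5):
        print(alpha, round(weibull_ratio(5.0, alpha, 5), 2))
```

At $\alpha = 1$ the Weibull reduces to the exponential and the ratio is exactly one, which gives a quick sanity check on any implementation.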

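Analogous to the previous discussion, consider now the gamma
distribution. As before, assume the lifetimes $X_1, X_2, \ldots, X_n$ follow
the exponential with pdf given by (1). What are the consequences
relative to the MSE of (2) if in fact the more appropriate
probability law is the gamma with pdf

$$g(x;\theta,\alpha) = \frac{1}{\Gamma(\alpha)\,\theta^{\alpha}}\, x^{\alpha-1} \exp\left(-\frac{x}{\theta}\right), \quad x > 0. \tag{7}$$

First, with respect to (7), it is easy to show that the MVUE of
$\theta$ is

$$\hat\theta = \sum_{i=1}^{n} \frac{x_i}{\alpha n} \tag{8}$$

while

$$\mathrm{MSE}(\hat\theta)_{G} = \frac{\theta^{2}}{\alpha n}. \tag{9}$$

Then to determine the MSE of (2), consider

$$E(\hat\theta) = E\left(\sum_{i=1}^{n} \frac{x_i}{n}\right) = \alpha\theta$$

while

$$\mathrm{var}(\hat\theta) = \mathrm{var}\left(\sum_{i=1}^{n} \frac{x_i}{n}\right) = \frac{1}{n^{2}}\sum_{i=1}^{n} \mathrm{var}(x_i) = \frac{\alpha\theta^{2}}{n}.$$

Thus, the perturbed MSE of (2) reduces to

$$\mathrm{MSE}(\hat\theta)_{E|G} = \frac{\theta^{2}\{\alpha + n(1-\alpha)^{2}\}}{n}. \tag{10}$$

As before, the comparison between equations (9) and (10) should
reveal the degree of robustness of the exponential distribution
as measured by the MSE of the scale parameter $\theta$.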


3. NUMERICAL RESULTS. To evaluate the robustness of the
exponential with regard to the estimation of the scale parameter
when the true sampling distribution is the Weibull, the ratio of
equation (6) to equation (5) is formed. The notion here is that
since in reality the lifetimes follow the Weibull time-to-failure
probability law, then the best efficiency of the MVUE of $\theta$ is
provided by (5). Thus the "perturbed" MSE given by (6) should be
compared to (5). Table 1 contains this ratio computed for several
values of $\theta$, $\alpha$ and the sample size n.

By a similar argument, the ratio of equation (10) to equation
(9) is formed to quantify the robustness of the exponential
relative to the gamma distribution. However in this case, the
ratio is the simple expression given by

$$\frac{\theta^{2}\{\alpha + n(1-\alpha)^{2}\}/n}{\theta^{2}/n\alpha} = \alpha\{\alpha + n(1-\alpha)^{2}\}$$

which is seen to be independent of the value of $\theta$. For various
values of $\alpha$ and n, this ratio is given in Table 2.
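
The gamma ratio above is simple enough to tabulate in a few lines. The sketch below is illustrative only (the function names are not from the paper); it also forms the ratio from the two MSE expressions separately, to show that the scale parameter cancels.

```python
def mse_gamma_mvue(theta, alpha, n):
    """Equation (9): MSE of the gamma MVUE of theta."""
    return theta ** 2 / (alpha * n)

def mse_exp_given_gamma(theta, alpha, n):
    """Equation (10): MSE of the sample mean under gamma sampling."""
    return theta ** 2 * (alpha + n * (1.0 - alpha) ** 2) / n

def gamma_ratio(alpha, n):
    """Closed-form ratio of (10) to (9); it does not depend on theta."""
    return alpha * (alpha + n * (1.0 - alpha) ** 2)

if __name__ == "__main__":
    for alpha in (0.5, 0.8, 1.0, 1.2, 1.6):
        print(alpha, [round(gamma_ratio(alpha, n), 3) for n in (5, 10, 20)])
```

For example, `gamma_ratio(0.5, 5)` gives 0.875 and `gamma_ratio(1.6, 20)` gives 14.08 up to floating-point rounding.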


4. CONCLUDING REMARKS. Based on the results contained
herein, it is apparent that relative to the estimation of the
scale parameter, the exponential distribution is extremely
sensitive if in reality the Weibull is the sampling distribution and
the shape parameter is less than one. However, there is a modest
range of the shape parameter - say (1.0, 1.3) - for which there
is substantial robustness on the part of the exponential
distribution. Moreover, the robustness is more apparent for smaller
sample sizes and smaller values of $\theta$.

For the case involving the gamma distribution, to some
extent the opposite appears to hold. That is, for values of the
shape parameter that are less than unity, considerable robustness
is apparent especially for small sample sizes with only a modest
amount present in the neighborhood but on the positive side of
one.


Table 1

Ratio of $\mathrm{MSE}(\hat\theta)_{E|W}$ to $\mathrm{MSE}(\hat\theta)_{W}$

n = 5
                                  alpha
theta     0.8     0.9    1.10    1.20    1.50    2.00    2.50
   5     6.97    2.29    0.71    0.75    1.24    1.86    2.21
  10    11.60    2.93    0.74    0.93    1.77    2.61    3.03
  15    15.46    3.39    0.77    1.05    2.07    2.99    3.41
  20    18.87    3.76    0.80    1.15    2.27    3.23    3.64
  25    21.96    4.08    0.82    1.22    2.43    3.39    3.80
  30    24.81    4.35    0.84    1.29    2.56    3.52    3.92
  35    27.48    4.60    0.86    1.34    2.66    3.62    4.01
  40    30.00    4.83    0.87    1.39    2.74    3.70    4.08
  45    32.39    5.03    0.89    1.43    2.81    3.77    4.14
  50    34.68    5.22    0.90    1.47    2.88    3.83    4.19

n = 10
                                  alpha
theta     0.8     0.9    1.10    1.20    1.50    2.00    2.50
   5     9.38    2.62    0.85    1.15    2.36    3.69    4.40
  10    16.75    3.58    0.98    1.58    3.46    5.20    6.05
  15    23.02    4.28    1.07    1.86    4.08    5.96    6.82
  20    28.61    4.86    1.15    2.07    4.51    6.44    [illegible]
  25    33.71    5.35    1.21    2.24    4.82    6.78    7.60
  30    38.45    5.79    1.27    2.37    5.07    7.03    7.83
  35    42.89    6.18    1.31    2.49    5.28    7.23    8.01
  40    47.10    6.54    1.36    2.60    5.45    7.40    8.16
  45    51.11    6.87    1.39    2.69    5.60    7.54    [illegible]
  50    54.94    7.18    1.43    2.77    5.73    7.65    8.38

n = 20
                                  alpha
theta     0.8     0.9    1.10    1.20    1.50    2.00    2.50
   5    14.20    3.29    1.13    1.94    4.58    7.33    8.79
  10    27.05    4.86    1.45    2.87    6.83   10.38   12.09
  15    38.14    6.06    1.68    3.47    8.10   11.91   13.63
  20    48.10    7.04    1.85    3.91    8.96   12.87   14.55
  25    57.23    7.90    2.00    4.26    9.60   13.55   15.19
  30    65.73    8.65    2.12    4.55   10.10   14.06   15.66
  35    73.72    9.34    2.22    4.80   10.52   14.46   16.02
  40    81.30    9.96    2.32    5.01   10.87   14.79   16.31
  45    88.53   10.54    2.40    5.20   11.16   15.07   16.55
  50    95.45   11.09    2.48    5.37   11.43   15.31   16.75


Table 2

Ratio of $\mathrm{MSE}(\hat\theta)_{E|G}$ to $\mathrm{MSE}(\hat\theta)_{G}$

                    n
alpha       5       10       20
0.50     0.875     1.50     2.75
0.60     0.840     1.32     2.28
0.70     0.805     1.12     1.75
0.80     0.800     0.96     1.28
0.90     0.855     0.90     0.99
0.95     0.914     0.93     0.95
1.00     1.000     1.00     1.00
1.10     1.265     1.32     1.43
1.20     1.680     1.92     2.40
1.40     3.080     4.20     6.44
1.60     5.440     8.32    14.08

RANDOM INTERVAL RELIABILITY


Gerald R. Andersen
Headquarters, U.S. Army Materiel
Development and Readiness Cmd.
5001 Eisenhower Ave.
Alexandria, Virginia


Abstract. Simple expressions are derived for interval
reliability when, in addition to random life and repair times,
the time of request for system availability and the duration
of the mission occasioned by that request are random variables,
rather than numerical constants. The results constitute a
simple generalization of the interval reliability results noted
in Barlow and Proschan [1].

The investigation was motivated by the desire to discourage
the extensive misapplication of the result of [1] p. 82 in
setting reliability values for large scale Army systems in
pre-development requirements documents.

1. Introduction. Let $\Gamma$ be a stochastic process whose value,
$\Gamma(t)$, at a particular time $t \ge 0$, describes the operating state of
some system at time t. We will only consider systems with two
states, up (operable/operating) or down (in repair). Specifically,
we will say that the system is up at time t if $\Gamma(t)=1$ and down
at time t if $\Gamma(t)=0$. We assume that $\Gamma(0)=1$ with probability one.

Starting at time t=0, let $X_1, Y_1, X_2, Y_2, \ldots$ denote the successive
lengths of time that the process, $\Gamma$, spends in the up or down
state, respectively.

Let

$$T_\nu = X_\nu + Y_\nu, \quad \nu \ge 1, \tag{1.1}$$

$S_0 = 0$ and define $S_n$ by setting

$$S_n = \sum_{\nu=1}^{n} T_\nu. \tag{1.2}$$

Throughout most of this note each of the sequences $\{X_i\}$ and
$\{Y_i\}$ will consist of independent and identically distributed (IID)
r.v.'s. In this case, $\{S_n\}$ is the usual type of renewal process
used to study systems where the $X_i$'s are the times to failure and
the $Y_i$'s are the times to replacement or to repair to original
condition.

Associated with this renewal process is the counting
process N(t), where

$$N(t) = k \quad \text{and} \quad S_{N(t)} = S_k \tag{1.3}$$

if, and only if,

$$S_k \le t < S_{k+1}. \tag{1.4}$$

The "residual life" process, $\zeta(t)$, defined by setting

$$\zeta(t) = S_{N(t)} + X_{N(t)+1} - t \tag{1.5}$$

($t > 0$) is useful in investigating the probability that $\Gamma(t)=1$
during various intervals of time.

Since N(t) represents the number of times the process $\Gamma(t)$
returns to the up state during the interval (0,t), the event that
$\zeta(t) > x$ coincides with the event that the system is in the up
state at time t and remains in that state for at least x units of
time.

[Timeline sketch: the points 0, $S_{N(t)}$, t, t + x, $S_{N(t)} + X_{N(t)+1}$
and $S_{N(t)+1}$, in increasing order along the time axis.]
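
The residual life can be read off a sample path directly. The sketch below is an illustration under the assumption that the residual life at time t is the end of the up period of the current renewal cycle minus t, so that a positive value means the system is up at t and stays up for that long, while a non-positive value means it is in repair. The function name and the list-based representation of the path are hypothetical.

```python
def residual_life(ups, downs, t):
    """Residual life at time t for a finite alternating up/down sample path.

    ups[i] and downs[i] are the lengths of the (i+1)-th up and down periods.
    Returns S_N(t) + X_{N(t)+1} - t, where S_N(t) is the start of the
    renewal cycle that contains t.
    """
    cycle_start = 0.0  # S_k, the most recent return to the up state
    for x, y in zip(ups, downs):
        if t < cycle_start + x + y:  # S_k <= t < S_{k+1}
            return cycle_start + x - t
        cycle_start += x + y
    raise ValueError("t lies beyond the supplied sample path")

if __name__ == "__main__":
    ups, downs = [3.0, 2.0], [1.0, 1.0]
    for t in (1.0, 3.5, 4.5):
        print(t, residual_life(ups, downs, t))
```

With up periods of 3 and 2 and down periods of 1 and 1, the system is up on [0, 3) and [4, 6), so the residual life at t = 1 is 2, at t = 3.5 it is negative (in repair), and at t = 4.5 it is 1.5.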


In section 2 we will obtain exact and asymptotic expressions
for the probability that $\zeta(\tau)$ exceeds the quantity M when both
$\tau$ and M are random variables. This probability, that the system
is up throughout the interval $[\tau, \tau+M]$, is called interval
reliability by Barlow and Proschan [1] p. 82, in the case where
$\tau$ and M are non-random. It is interesting to note that many Army
documents, including a guide on reliability techniques [10],
apply the result in [1] but with the claim that either $\tau$ or M
is random.

The  mathematics  required  to  make  this  extension  from  the 
well-known  results  in  Barlow  and  Proschan,  or  Gnedenko  [ *f ] ,  or 
Feller  [!)]  is  very  simple,  but  in  some  ways  the  results  are 
reasonably  interesting.  In  spite  of  this,  it  is  doubtful  that 
one  would  announce  the  results  of  such  a  simple  task  if  it  were 
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not  for  the  insane  realism  that  some  practitioners  of  reliability 
inject  into  reliability  "requirements"  as  deduced  from 
mathematical  facts  about  residual  life.  This  topic  is 
expanded  on  in  Example  A  of  section  2. 

In section 3, we note the well-known fact that an
asymptotic result of section 2 is the limit of a statistic
which gives the percentage of time, during n renewals, that the
system is up and remains up for a sufficient amount of time to
support a mission of duration x. A result is then stated
concerning the asymptotic normality of a similar statistic
(one representing the percentage of up-time that the system
is available for a mission of duration x).

Section  4  is  an  attempt  to  consider  the  interval  reliability 
problem  when  successive  system  life  and  repair  times  are  not 
identically  distributed. 

2.0 Residual life; independent and identically distributed
case. Let the sequences $\{X_i\}$ and $\{Y_i\}$ of section 1 be sequences
of independent and identically distributed positive random variables
(r.v.'s) and assume also that $\{X_i\}$ and $\{Y_i\}$ are independent of
each other. Thus, in this section, the $X_i$'s have the usual
interpretation of time to system failure and the $Y_i$'s the time to
replace or repair the system to a state which is as good as new.

We will denote the common distribution function (d.f.) of the
$X_i$'s by G, of the $Y_i$'s by H and, where appropriate, use X to refer
to one of the $X_i$'s and Y to one of the $Y_i$'s. Set F equal to the
d.f. of T = X+Y. Let the positive r.v.'s $\tau$ and M of section 1 be
independent of each other and of the sequences $\{X_i\}$ and $\{Y_i\}$.
Denote the d.f.'s of $\tau$ and M by K and L, respectively. Although
termed a positive r.v., M will be allowed to take the value zero
with positive probability; especially, the case M=0 with probability
one (a.s.). This allows "availability" as well as interval
reliability statements to be included in the same expression.

When M=0 (a.s.), then $L = \epsilon$, where $\epsilon$ will denote the unit d.f.;

$$\epsilon(y) = \begin{cases} 0 & \text{if } y < 0 \\ 1 & \text{if } y \ge 0. \end{cases} \tag{2.1}$$

To avoid needless complications, we suppose that K(0)=0
and G(0)=0 (the latter guarantees that passage of the system from
one down state to the next is never instantaneous). It follows
that F(0)=0. Let

$$U(t) = \sum_{k=1}^{\infty} F^{*k}(t), \tag{2.2}$$

where $F^{*k}$ denotes the k-fold convolution of F with itself. It
is well-known that the renewal function $U(t) < +\infty$ for each t
($0 < t < +\infty$) and U(t) = EN(t) (cf. section 1). Consult Feller [jj]
for facts about U, but note that his U counts $S_0 = 0$ as the first
renewal of the process and so equals 1+U, U being given as
in (2.2). The definition (2.2) follows most "applied" probability
and reliability texts (e.g. [1], [11]).

The  physical  meaning  of  these  four  sets  of  r.v.'s  is  as  stated 
in  section  1.  Mathematically,  since  we  have  assumed  that  all  r.v.'s 
are  independent,  we  can,  without  loss  of  generality,  take  them  to  be 
defined  on  the  same  probability  space,  . 

Set Q equal to the d.f. of the r.v. Z = X-M so that

$$Q(z) = \int_{0-}^{\infty} G(z+y)\,dL(y) \tag{2.3}$$

for all real z.

Results:  We  shall  now  state  and  discuss  the  results  of  this 
section;  if  a  proof  is  cumbersome  it  is  placed  at  the  end  of  the 
section. 

Theorem 1. If EX, EY and $E\tau$ are finite, then

$$P(\zeta_\tau > M) = \int_0^{\infty} \left\{ K(z) + \int_0^{\infty} \left( K(z+s) - K(s) \right) dU(s) \right\} dQ(z). \tag{2.4}$$

Thus, (2.4) gives the probability that the system is up at
some randomly selected moment in time, $\tau$, and remains up for a
random duration, M, of the mission occasioned by the request at
time $\tau$. By specifying only the d.f. of $\tau$ in Theorem 1, we have
the following

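Corollary 1. If the request time $\tau$ is exponentially distributed
with mean $1/\lambda$, ($\lambda > 0$), then

$$P(\zeta_\tau > M) = (1 - \hat F(\lambda))^{-1} \int_0^{\infty} (1 - e^{-\lambda z})\,dQ(z) \tag{2.5}$$

where $\hat F$ is the Laplace-Stieltjes transform of the d.f. F.

To verify the corollary from (2.4) just note that

$$K(s+z) - K(s) = e^{-\lambda s}(1 - e^{-\lambda z}),$$

so that

$$P(\zeta_\tau > M) = \int_0^{\infty} \left\{ (1 - e^{-\lambda z}) + (1 - e^{-\lambda z}) \int_0^{\infty} e^{-\lambda s}\,dU(s) \right\} dQ(z) = (1 + \hat U(\lambda)) \int_0^{\infty} (1 - e^{-\lambda z})\,dQ(z),$$

where $\hat U$ is the Laplace-Stieltjes transform of U. Equation (2.5)
follows since $\hat U(\lambda) = \hat F(\lambda)/(1 - \hat F(\lambda))$ for all $\lambda > 0$ (recall that F(0)=0).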

Remark 1. It is both intuitively and analytically obvious that
(2.5) may be written in the form

$$P(\zeta_\tau > M) = P(\tau < X - M \mid \tau \le X + Y). \tag{2.6}$$

Intuitively, because the exponential distribution has no memory
and analytically, because

$$\int_0^{\infty} (1 - e^{-\lambda z})\,dQ(z) = P(\tau < X - M) = P(\tau \le X - M,\ \tau \le X + Y)$$

and

$$1 - \hat F(\lambda) = P(\tau \le X + Y).$$

Remark 2. The artificiality of the exponential assumption on $\tau$
can be attenuated somewhat by noting that if K is taken to be a
mixture of exponentials;

$$K(z) = \sum_{\nu} a_\nu (1 - e^{-\lambda_\nu z}), \tag{2.7}$$

$a_\nu \ge 0$ for all $\nu$, $\sum_\nu a_\nu = 1$ (that is, the tail of K can be
expressed as a Dirichlet series), then (2.5) is preserved in the form

$$P(\zeta_\tau > M) = \sum_{\nu} a_\nu (1 - \hat F(\lambda_\nu))^{-1} \int_0^{\infty} (1 - e^{-\lambda_\nu z})\,dQ(z).$$

Remark 3. Set $Q_+$ equal to the d.f. of $(X-M)^+$ where $S^+$ denotes
the function which equals S if S>0 and 0 if $S \le 0$. Then since
$1 - e^{-\lambda z}$ vanishes at 0, the only point on $[0,\infty)$ where $Q_+$ and Q differ,
we can replace Q by $Q_+$ and write (2.5) as

$$P(\zeta_{\tau(\lambda)} > M) = (1 - \hat Q_+(\lambda))/(1 - \hat F(\lambda)). \tag{2.8}$$

This form not only suggests easy computation (simulation is easily
carried out from (2.6)), but it motivates the following observation:
if $\lambda \to 0+$ (so that $E\tau \to +\infty$), then, writing $\tau = \tau(\lambda)$,

$$P(\zeta_{\tau(\lambda)} > M) \to E(X-M)^+/(\mu_1 + \mu_2) \tag{2.9}$$

where $\mu_1 = EX$, $\mu_2 = EY < +\infty$ and $E(X-M)^+ \le EX < +\infty$.

Just recognize the RHS of (2.8) as the ratio of the difference
quotients of $\hat Q_+$ and $\hat F$; passing to the limit as $\lambda \to 0+$ gives the
ratio of the means of $(X-M)^+$ and X+Y (which both exist since
$EX < +\infty$ and $EY < +\infty$).

As one would expect, the limit in (2.9) is preserved if the
exponentiality of request time is dropped and $\tau(\lambda)$ is replaced
by any sequence which converges in probability to $+\infty$.

Theorem 2. Let $\mu_1 = EX$ and $\mu_2 = EY$ be finite and T non-lattice. If
$\tau_n$ ($n \ge 1$) is a sequence of positive r.v.'s and $\tau_n \to +\infty$ in probability,
then

$$P(\zeta_{\tau_n} > M) \to E(X-M)^+/(\mu_1 + \mu_2) \tag{2.10}$$

as $n \to +\infty$.

(The proof is at the end of this section.)

Remark 4. A simple calculation shows that

$$E(X-M)^+ = E(X-M \mid X>M)\,P(X>M). \tag{2.11}$$

Also, if we let the minimum of two real numbers a and b be denoted
by $a \wedge b$ and observe the identity

$$(a-b)^+ = a - a \wedge b,$$

then we can express (2.10) in the following two equivalent forms

$$P(\zeta_{\tau_n} > M) \to \frac{E(X-M \mid X>M)}{\mu_1 + \mu_2} \cdot P(X>M) = \frac{\mu_1}{\mu_1 + \mu_2} - \frac{E(X \wedge M)}{\mu_1 + \mu_2} \tag{2.12}$$

as $n \to +\infty$.


When M=0 a.s., the RHS of both (2.10) and (2.12) reduce to
the so-called "availability" of the system; $\mu_1/(\mu_1 + \mu_2)$. The last
relation in (2.12) is therefore especially intuitive since it
shows directly the amount by which the availability should be
decreased if one wants to account for the system being up
throughout a mission of (random) duration, M.

In view of the above, it would seem to be appropriate to call

$$A(M) = \frac{E(X-M)^+}{\mu_1 + \mu_2} \tag{2.13}$$

system availability for missions of length M.
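
The availability for missions of random length can be estimated for any choice of life and mission distributions by averaging the positive part of X - M. The following is a rough Monte Carlo sketch, not a method from the paper; the function name, the sampler arguments and the replication count are assumptions made for illustration.

```python
import random

def mission_availability(draw_x, draw_m, mu1, mu2, reps=200_000, seed=2):
    """Estimate E[(X - M)^+] / (mu1 + mu2) by simulation.

    draw_x and draw_m take a random.Random instance and return one draw of
    the life time X and the mission duration M; mu1 = EX and mu2 = EY.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        total += max(draw_x(rng) - draw_m(rng), 0.0)
    return (total / reps) / (mu1 + mu2)

if __name__ == "__main__":
    mu1, mu2, mean_m = 10.0, 2.0, 1.0
    estimate = mission_availability(
        lambda r: r.expovariate(1.0 / mu1),     # exponential life, mean mu1
        lambda r: r.expovariate(1.0 / mean_m),  # exponential mission, mean 1
        mu1, mu2)
    # For exponential X and M, E(X-M)^+ = mu1 * P(X > M) = mu1^2 / (mu1 + EM).
    exact = (mu1 ** 2 / (mu1 + mean_m)) / (mu1 + mu2)
    print(estimate, exact)
```

Passing samplers rather than distribution names keeps the sketch independent of any particular family, which matches the point that the mission law is a free choice.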

Remark 5. When $t \ge 0$, $x > 0$ are (nonrandom) real numbers and $\tau = t$, $M = x$
(a.s.) then the classical limit of $P(\zeta_t > x)$, as $t \to \infty$, (e.g.
[1], [^], [((]) agrees with all the above-mentioned forms; just
note that

$$\int_x^{\infty} \bar G(y)\,dy = \int_x^{\infty} (y-x)\,dG(y) = \int_0^{\infty} (y-x)^+\,dG(y), \quad \bar G = 1 - G.$$

Examples ; 


A. Let $\tau$ be exponential as in Corollary 1 and, in this first
example, let X also be exponential with parameter $\theta_1$ ($EX = \mu_1 = \theta_1^{-1}$).
Whenever X has this distribution it follows from (2.3) that

$$\bar Q(y) = e^{-\theta_1 y}\,\hat L(\theta_1), \quad y \ge 0, \quad \bar Q = 1 - Q,$$

where $\hat L$ is the Laplace-Stieltjes transform of L. Direct calculation
then gives

$$\int_0^{\infty} (1 - e^{-\lambda y})\,dQ(y) = \lambda \hat L(\theta_1)/(\lambda + \theta_1).$$

Since

$$\hat F(\lambda) = \theta_1 \hat H(\lambda)/(\lambda + \theta_1)$$

we find (using Corollary 1) that

$$P(\zeta_\tau > M) = \frac{\mu_1 \hat L(\theta_1)}{\mu_1 + (1 - \hat H(\lambda))/\lambda} \tag{A.1}$$

where the distribution of M and Y remain to be specified. To note
the resemblance to (2.12) just observe that $\hat L(\theta_1) = P(X>M)$, in this
example. Of course, if we let $\lambda \to 0+$ in (A.1), we would obtain a
special case of (2.12).

Now, if we further specify the distribution of Y to be
exponential with parameter $\theta_2$, $EY = \theta_2^{-1} = \mu_2$, we obtain (from (A.1))

$$P(\zeta_\tau > M) = \frac{\mu_1}{\mu_1 + (\theta_2 + \lambda)^{-1}}\,\hat L(\theta_1). \tag{A.2}$$

Finally, taking M to be exponential also, (A.2) becomes

$$P(\zeta_\tau > M) = \frac{\mu_1}{\mu_1 + (\theta_2 + \lambda)^{-1}} \cdot \frac{\mu_1}{\mu_1 + \mu_3} \tag{A.3}$$

where $\mu_3 = EM$. So, in this case, $P(\zeta_\tau > M)$ has the appearance of the
product of two "availability" terms.
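
The all-exponential case lends itself to a direct simulation check. The sketch below is an illustration only: it simulates the alternating up/down process until an exponentially distributed request time, records whether the system is up at the request and stays up for an exponential mission, and compares the frequency with the product form of (A.3) as read here, namely mu1/(mu1 + 1/(theta2 + lambda)) times mu1/(mu1 + mu3). Function names, parameter values and the seed are assumptions.

```python
import random

def simulate_interval_reliability(mu1, mu2, mu3, lam, trials=100_000, seed=3):
    """Fraction of trials in which the system is up at the request time tau
    and remains up for the whole mission M (all quantities exponential)."""
    rng = random.Random(seed)
    successes = 0
    for _ in range(trials):
        tau = rng.expovariate(lam)        # request time, mean 1/lam
        m = rng.expovariate(1.0 / mu3)    # mission duration, mean mu3
        clock = 0.0                       # start of the current up period
        while True:
            x = rng.expovariate(1.0 / mu1)
            if tau < clock + x:           # request arrives while up
                if clock + x - tau > m:   # residual life exceeds the mission
                    successes += 1
                break
            clock += x
            y = rng.expovariate(1.0 / mu2)
            if tau < clock + y:           # request arrives during repair
                break
            clock += y
    return successes / trials

def closed_form_a3(mu1, mu2, mu3, lam):
    """Product form (A.3) with theta2 = 1/mu2."""
    theta2 = 1.0 / mu2
    return mu1 / (mu1 + 1.0 / (theta2 + lam)) * mu1 / (mu1 + mu3)

if __name__ == "__main__":
    args = (10.0, 2.0, 1.0, 0.1)
    print(simulate_interval_reliability(*args), closed_form_a3(*args))
```

The two printed numbers should agree to within Monte Carlo error; a larger `trials` value tightens the comparison at the cost of run time.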

If, instead of being exponential, M is taken to be degenerate
at x, i.e., $L(s) = \epsilon(s-x)$, where $\epsilon$ is defined in (2.1), it follows
from (A.1) that

$$P(\zeta_\tau > M) = \frac{\mu_1 e^{-x/\mu_1}}{\mu_1 + (1 - \hat H(\lambda))/\lambda} \tag{A.4}$$

with the distribution of Y unspecified.

Notice that if $\lambda \to 0$ in (A.4) (or just use (2.12)) the RHS of
(A.4) is

$$\frac{\mu_1}{\mu_1 + \mu_2}\,e^{-x/\mu_1}. \tag{A.5}$$

It is the almost exclusive use/misuse of this formula that
causes one to produce the variations on a theme found in this note.

For example, one objectionable use of (A.5) is to specify x and
$\mu_2$, and then to set the expression in (A.5) equal to some high number,
such as .97, and solve for $\mu_1$. (This, of course, is done with no
knowledge that the life distribution is exponential.) This $\mu_1$, call
it $\mu_1^0$, is then claimed, in advance development documents, to be the
"required" mean-time-to-failure of the system; usually this is a
complex military system which has either never been produced before
or one for which we lack a substantial base-line of experience under
a realistic mission profile. To make matters worse, this value of
$\mu_1^0$ and a similarly derived value, $\mu_1^1$, obtained by setting (A.5) equal
to some slightly smaller number such as .94, are used as the null
and alternate hypotheses, respectively, in a statistical acceptance
plan. Note that when this so-called acceptance plan is applied, it
will be to a total population of perhaps one or two systems.
Moreover, the system will be constantly undergoing design changes and
differing conditions of stress. Needless to say, such practices often
produce a reject signal from the testing community. If, on the basis
of experience and common sense, the systems under test are judged to
do their job reliably, at reasonable cost and more effectively than
any system in the arsenal, these reject signals are properly ignored,
but often not without the significant costs of re-tests, check-tests,
needless re-design and a near infinity of meetings, briefings and
"analyses".

The purpose then of the present note is to furnish Army
statisticians with two more "degrees-of-freedom" (mission and request time
distributions) in numerous formulas that will aid them in convincing
the occasional naive practitioner of reliability that applications
of (A.5) as described above are a totally unrealistic way of setting
reliability requirements. This can be done by producing a wide
variety of answers with judicious choice of distributions for mission
and request times. The variability obtained through distributions
which cannot be predicted might be enough to convince the R&D
community to state reliability figures-of-merit as goals-to-point-toward
and not hard requirements to be "demonstrated" in some
pseudo-statistical test. The only possible danger is that the results stated in
this paper will be misused in the same way as (A.5). It should be
emphasized that designing reliable military systems is of the utmost
importance and it is not the purpose of these remarks to argue
otherwise. On the contrary, it is hoped that by discouraging an absurd
approach to setting reliability requirements emphasis will be placed
on engineering reliability into new systems.
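
How much the answer moves with the choice of mission distribution can be seen in a small calculation. The sketch below is an illustration under stated assumptions: life and repair times are exponential, the request time is far in the future (the limiting case), and the limiting interval reliability is the availability mu1/(mu1 + mu2) multiplied by the Laplace-Stieltjes transform of the mission law evaluated at 1/mu1. The three mission laws share the same mean; all function names and parameter values are hypothetical.

```python
from math import exp

def availability(mu1, mu2):
    """Limiting availability mu1 / (mu1 + mu2)."""
    return mu1 / (mu1 + mu2)

def lst_degenerate(theta, x):
    """Transform of a mission of fixed length x."""
    return exp(-theta * x)

def lst_exponential(theta, mean):
    """Transform of an exponential mission with the given mean."""
    return 1.0 / (1.0 + theta * mean)

def lst_uniform(theta, upper):
    """Transform of a mission uniform over (0, upper)."""
    return (1.0 - exp(-theta * upper)) / (theta * upper)

def limiting_interval_reliability(mu1, mu2, transform_at_theta1):
    """Availability times P(X > M), for exponential X with mean mu1."""
    return availability(mu1, mu2) * transform_at_theta1

if __name__ == "__main__":
    mu1, mu2, mean_mission = 10.0, 1.0, 5.0
    theta1 = 1.0 / mu1
    cases = {
        "fixed": lst_degenerate(theta1, mean_mission),
        "uniform": lst_uniform(theta1, 2.0 * mean_mission),
        "exponential": lst_exponential(theta1, mean_mission),
    }
    for name, value in cases.items():
        print(name, round(limiting_interval_reliability(mu1, mu2, value), 4))
```

With a mean life of 10, a mean repair time of 1 and a mean mission of 5, the three laws give roughly 0.551, 0.575 and 0.606, so the figure depends on more than the mean mission length.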

Before concluding example A, consider two additional distributions
for M. First, when M is uniformly distributed over (0,T):

$$P(\zeta_\tau > M) = \frac{1 - e^{-\theta_1 T}}{\theta_1 T} \cdot \frac{\mu_1}{\mu_1 + (\lambda + \theta_2)^{-1}} \tag{A.6}$$

with $\tau$, X and Y exponential as above.

The second is when M is normally distributed as $N(\mu_3, \sigma)$
conditioned to be positive. Then

$$P(\zeta_\tau > M) = \frac{\mu_1}{\mu_1 + (\theta_2 + \lambda)^{-1}} \exp\left(-\left(\theta_1 \mu_3 - \frac{(\theta_1 \sigma)^2}{2}\right)\right) \frac{\Phi(\mu_3/\sigma - \sigma\theta_1)}{\Phi(\mu_3/\sigma)}$$

where $\Phi$ is the d.f. of the standard, N(0,1), normal r.v.


B.  Because  of  the  ease  of  calculation  we  consider  the  case 
when  X  is  Rayleigh  (a) , 


P(X>s)  = 


and  M  is  Rayleigh  (a) .  Then 


EXAM 


V2a2 


where  3  =  aa/ya^+a^  and  so 

E(X-M)‘‘'  =  EX-EXAM  =  a 


(1 - -  ) 

/a^+a^ 


Since  EX 


we  therefore  can  write  (2.13)  in  the  form 


EX+EY  * 


/a^+a 


(B.l) 


=  A(0)  (1 - - -  ) 

/a^+a^ 

Another  simple  application  of  Theorem  2  is  obtained  when  X 
is  Gcimma  (2,3)  : 

P(X>s)  =  e"^®  (l+3s) 

and  M  is  the  square  of  a  N(0,a)  r.v..  Then  (after  some  tedious 
calculations) 


A(M)  =  A(0) 


(l+(1.5)a^3) 

(1+20^3) 


(B.2) 

C. We now return to Theorem 1 and show its relationship to
some known results on availability, without using the exponential
assumption of Corollary 1 for the request time distribution.

For this purpose, let the request time be a fixed constant,
i.e.,

$$\tau(\omega) = T > 0$$

for all $\omega \in \Omega$. Then $K(x) = \epsilon(x-T)$, where $\epsilon$ is defined in (2.1).
Then the function

$$\Psi(s,z) = K(s+z) - K(s)$$

is equal to one in the unbounded region defined by $0 \le s < T$ and
$s + z \ge T$, and zero otherwise.

It follows that the RHS of (2.4) is given by

$$\int_0^{\infty} K(y)\,dQ(y) + \int_0^{\infty}\!\int_0^{\infty} (K(s+y) - K(s))\,dQ(y)\,dU(s) = \int_T^{\infty} dQ(y) + \int_0^{T} \bar Q(T-s)\,dU(s), \quad \bar Q = 1 - Q. \tag{C.1}$$

For ease of computation, let X, Y and M be exponential with
parameters θ₁, θ₂ and a, respectively. Then it is easy to
show that

\[ U(t) = \frac{1}{\mu}\Bigl(t - \frac{1}{\alpha}\bigl(1 - e^{-\alpha t}\bigr)\Bigr), \]

where μ=μ₁+μ₂ and α=θ₁+θ₂.

Since Q̄(y) = a exp(-θ₁y)/(θ₁+a), for y>0, we obtain ((2.4)
and (C.1))
"■'V  «>  -  ?  '■‘1+ <=-2> 

Notice that if T→+∞, (C.2) becomes (A.3) with λ=0 as it
should.  Also,  If  yj^=0  m(C.2)then  (23)  of  [12]  is  obtained,  i. 
Proof of Theorem 1. Proceeding either with a standard renewal
theory argument or directly from the equation for the d.f. of ξ_t,
t  nonrandom,  in  [  1  ]  p.4»C>or  [  II  ]  p.35*f  we  obtain 

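\[ P(\xi_t > M) = \bar{Q}(t) + \bar{Q}*U(t), \qquad \text{(2.14)} \]

where Q is given in (2.3). Alternately, this is a special case
of a more general (non-identically distributed) case derived in
section 4. Since K(0)=0 (K = d.f. of τ),

\[ \int_0^\infty \bar{Q}(t)\,dK(t) = \int_0^\infty K(t)\,dQ(t). \]

Consider

\[ \int_0^\infty \bar{Q}*U(t)\,dK(t) = \int_0^\infty\!\!\int_0^t \bar{Q}(t-y)\,dU(y)\,dK(t)
   = \int_0^\infty\!\!\int_0^\infty \bar{Q}(s)\,dK(s+y)\,dU(y) \]

and observe that this last integral, call it I, is finite. This
follows from Eτ < +∞ and the well-known fact that U(y) ~ y/μ as
y→∞, since then

\[ I \le \int_0^\infty (1-K(y))\,dU(y) = \int_0^\infty U(y)\,dK(y) = O(E\tau) < +\infty \]

(we have made use of the fact that U(y)(1-K(y)) ~ (y/μ)(1-K(y)) → 0
if Eτ < +∞). Now

\[ \int_0^\infty \bar{Q}(s)\,dK(s+y) = -\bar{Q}(0)K(y) + \int_0^\infty K(s+y)\,dQ(s) \]

so that

\[ \int_0^\infty \bar{Q}*U(t)\,dK(t) = \int_0^\infty\!\!\int_0^\infty \bar{Q}(s)\,dK(s+y)\,dU(y) \]

\[ = -\bar{Q}(0)\int_0^\infty K(y)\,dU(y) + \int_0^\infty\!\!\int_0^\infty K(s+y)\,dQ(s)\,dU(y) \]

\[ = -\int_0^\infty\!\!\int_0^\infty K(y)\,dU(y)\,dQ(s) + \int_0^\infty\!\!\int_0^\infty K(s+y)\,dQ(s)\,dU(y) \]

\[ = \int_0^\infty\!\!\int_0^\infty \bigl(K(s+y)-K(y)\bigr)\,dU(y)\,dQ(s). \]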

Proof of Theorem 2. The sequence τ_n → +∞ in probability if, given
ε>0, A>0, there exists an integer n₀=n₀(ε,A) such that

P(τ_n > A) ≥ 1-ε if n>n₀.

Letting K_n be the d.f. of τ_n, n≥1, we can write

\[ P_n = P(\xi_{\tau_n} > M) = \int_0^\infty \bar{Q}(t)\,dK_n(t) + \int_0^\infty \bar{Q}*U(t)\,dK_n(t) = I(n) + J(n). \]

Now, let ε>0 be arbitrary and choose A>0 such that Q̄(A) = P(X-M>A) < ε;
then,

\[ 0 \le I(n) = \int_0^A \bar{Q}(t)\,dK_n(t) + \int_A^\infty \bar{Q}(t)\,dK_n(t)
   \le K_n(A) + \varepsilon\,(1-K_n(A)) \le K_n(A) + \varepsilon \]

so that

\[ 0 \le \limsup_{n\to\infty} I(n) \le \varepsilon. \]

Therefore, since ε>0 is arbitrary,
lim I(n) = 0.

For the term J(n), we of course follow the usual proof and
use the Key Renewal Theorem. This places an integrability
requirement on Q̄ which is equivalent (in our case) to showing that

\[ \int_0^\infty \bar{Q}(t)\,dt < +\infty. \]

This follows from EX < +∞. The assumption that T is non-lattice is
trivial in our application and can be guaranteed by requiring, for
example, either X or Y to have an absolutely continuous d.f.

The Key Renewal Theorem then states that

\[ \bar{Q}*U(t) \to \frac{1}{\mu}\int_0^\infty \bar{Q}(y)\,dy \]

as t → +∞, where μ=μ₁+μ₂. In what follows, call this limit B.

The argument for J(n) is similar to the one applied to I(n). That
is, by the previous limit, there is some C such that Q̄*U(t) is
within a preselected distance δ>0 of B for all t>C=C(δ). Given
some ε>0, the convergence to +∞ of τ_n is then used to find an
n₀=n₀(ε,C) so that P(τ_n>C) ≥ 1-ε if n>n₀. All this allows us
to conclude that both

\[ \int_C^\infty \bar{Q}*U(t)\,dK_n(t) \ge (B-\delta)\,P(\tau_n>C) \ge (B-\delta)(1-\varepsilon) = B-\varepsilon' \]

and

\[ \int_C^\infty \bar{Q}*U(t)\,dK_n(t) \le (B+\delta) \]

if n>n₀.

Since, also,

\[ 0 \le \int_0^C \bar{Q}*U(t)\,dK_n(t) = O\bigl(P(\tau_n \le C)\bigr) = o(1) \]

as n→∞, we have

\[ \lim_{n\to\infty} J(n) = B = \frac{1}{\mu}\int_0^\infty \bar{Q}(t)\,dt. \]

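It remains to evaluate the integral of Q̄ (recall EX < +∞):

\[ \int_0^\infty \bar{Q}(y)\,dy = \int_0^\infty\!\!\int_0^\infty P(X>y+m)\,dL(m)\,dy \]

\[ = \int_0^\infty\!\!\int_0^\infty P(X>y+m)\,dy\,dL(m) \]

\[ = \int_0^\infty\!\!\int_m^\infty P(X>s)\,ds\,dL(m) \]

\[ = \int_0^\infty P(X>s)\int_0^s dL(m)\,ds \]

\[ = \int_0^\infty P(X>s)\,P(M\le s)\,ds \]

\[ = EX - \int_0^\infty P(X>s,\ M>s)\,ds \]

\[ = EX - \int_0^\infty P(X\wedge M>s)\,ds \]

\[ = EX - E(X\wedge M) \]

where X∧M = minimum of X and M. Clearly, 0 ≤ EX-E(X∧M) ≤ EX < +∞.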




3. Additional Comments on the IID Case. Using the stochastic
model of section 2, the percentage of time, during n renewals of
the system, that the system is up and remains up for a sufficient
amount of time to support a mission of length x is given by

\[ P_n = P_n(x) = \frac{\sum_{i=1}^{n} (X_i - x)^+}{\sum_{i=1}^{n} T_i} \qquad \text{(3.1)} \]

(n≥1). (Throughout this section, x will be a strictly positive
real number.) Assuming that ET=EX+EY < +∞, it follows from the law
of large numbers and Slutsky's Theorem (cf. Cramér [3] p. 255) that

\[ P_n(x) \to \frac{E(X-x)^+}{\mu}, \qquad \text{(3.2)} \]

in probability as n→∞, where μ=μ₁+μ₂=EX+EY.

Thus, the statistic P_n(x) is a consistent estimator of the
quantity E(X-x)^+/μ, the ubiquitous limiting interval reliability
of [X ] and a special case of Corollary 1. The simple, practical
nature of P_n(x) probably explains the interest in describing
systems by means of interval reliability.

A related statistic with similar intuitive appeal is

\[ \psi_n(x) = \sum_{i=1}^{n} (X_i - x)^+ \Big/ \sum_{i=1}^{n} X_i . \]

Clearly, this statistic gives the percentage of up-time that the
system is available for a mission of length x and is a consistent
estimator of the quantity ψ(x) = E(X-x)^+/μ₁. From Corollary 1 of
section 2, this quantity is also easily seen to be the limit of
the probability that ξ_{τ_n} > x given that ξ_{τ_n} > 0 as n→∞, when
τ_n→+∞ in probability.

Using  the  work  of  Skorohod  [  <?  ]  Chapter  1,  Sec.  6,  Pyke  [7], 
Pyke and Shorack [8], and arguments similar to those in recent
work of Barlow and Proschan [2], it can be shown (under additional
assumptions) that √n (ψ_n(x) - ψ(x)) converges in probability to a
normally distributed r.v., N(0,σ²(x)), where the variance can be
calculated explicitly in terms of Var X, Var (X-x)^+,
and the d.f. G. The proof is outside the scope of this note and
will be reported elsewhere.

The usefulness of such a result is that it places emphasis
on ψ_n(x), a directly measurable quantity, rather than on ψ(x),
which requires a distributional assumption.
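Both statistics of this section are simple ratios of observed sums. The following is a minimal sketch, assuming P_n(x) is the sum of the excesses (X_i - x)^+ divided by the total cycle time and ψ_n(x) is the same sum divided by the total up-time; the function name and the sample values are hypothetical.

```python
def interval_reliability_estimates(up_times, down_times, x):
    """Return (P_n(x), psi_n(x)) from n observed up/down cycles.

    up_times[i]   -- observed up-time X_i of cycle i
    down_times[i] -- observed down-time Y_i of cycle i
    x             -- required mission length (strictly positive)
    """
    if not up_times or len(up_times) != len(down_times):
        raise ValueError("need the same positive number of up and down times")
    # Time during which the system is up and stays up for at least x more.
    excess = sum(max(u - x, 0.0) for u in up_times)
    total_up = sum(up_times)
    total_cycle = total_up + sum(down_times)
    return excess / total_cycle, excess / total_up


if __name__ == "__main__":
    p_n, psi_n = interval_reliability_estimates([4.0, 1.0, 3.0], [1.0, 1.0, 1.0], 2.0)
    print(p_n, psi_n)
```

Neither ratio needs a distributional assumption; only the recorded up and down durations are used.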
4. Residual Life: Non-identically Distributed Case.

In this section we suppose only that the sequences {X_n} and {Y_n}
of positive random variables are each sequences of independent
r.v.'s and, further, that the sequence {X_n} is independent of the
sequence {Y_n}.

Let G_i be the distribution function (d.f.) of X_i, i≥1, H_j
the d.f. of Y_j, j≥1, and set F_i equal to the d.f. of T_i=X_i+Y_i, i≥1.
As before, let M be a positive r.v. with d.f. L and assume that M
and the {X_n} and {Y_n} sequences are independent.

Set Q_i = d.f. of the r.v. Z_i = X_i-M, so that

\[ Q_i(z) = \int_{-\infty}^{\infty} G_i(z+y)\,dL(y) \qquad \text{(4.1)} \]

for all z in (-∞,∞).

Finally, observe that since we have not assumed that the
T_j, j≥1, are identically distributed r.v.'s, it is possible for
the partial sums S_n of section 1 to converge to some proper r.v.
in distribution (and hence with probability one (a.s.) on Ω).
For simplicity, we want to avoid this possibility and retain the
property of IID r.v.'s which states that S_n→+∞ (a.s.). Thus,
when we consider an instance where ΣX_i converges, the divergence
of S_n to +∞ will be guaranteed (even though the Y_i's are not
identically distributed) by assuming that ΣY_i→+∞ (a.s.).

Now, recall the definition of ξ_t, for non-random t≥0, given
in section 1 and partition the interval [0,∞) by the sequence of
partial sums S_n, n≥0. Then

\[ P(\xi_t > M) = \sum_{k=0}^{\infty} P(\xi_t > M,\ S_k \le t < S_{k+1}) \qquad \text{(4.2)} \]

\[ = \sum_{k=0}^{\infty} \int_0^t P(\xi_t > M,\ t < S_{k+1} \mid S_k = x)\,dP(S_k \le x) \]

\[ = \sum_{k=0}^{\infty} \int_0^t P(X_{k+1} - M > t - x,\ t - x < T_{k+1})\,dP(S_k \le x) \]

\[ = \sum_{k=1}^{\infty} \int_0^t P(Z_{k+1} > t - x)\,d\Bigl(\prod_{j=1}^{k}{}^{*} F_j\Bigr)(x) + P(Z_1 > t) \]

where, after conditioning on S_k, we have used a familiar property
of conditional probabilities (cf. Krickeberg [6] p. 170 problems
3 and 4), the independence of Z_{k+1} and S_k and, for the last
equality, the fact that the occurrence of the event [Z_{k+1} > t-x]
implies the occurrence of the event [T_{k+1} > t-x]. Therefore,
using the d.f.'s introduced above and the usual notation for a
convolution product:

\[ \prod_{j=1}^{k}{}^{*} F_j(t) = F_1 * \cdots * F_k(t) = P(S_k \le t), \]

we can write (4.2) in the form

\[ P(\xi_t > M) = \bar{Q}_1(t) + \sum_{k=1}^{\infty} \bar{Q}_{k+1} * \prod_{j=1}^{k}{}^{*} F_j(t) \qquad \text{(4.3)} \]

where Q̄_j = 1-Q_j and t≥0.

It is easy to see that under the assumption of the last
section  (that  is,  where  the  sequences  are  identically  as  well  as 
independently  distributed) ,  the  last  equation  reduces  to  equation 
(2.14)  of  section  2. 


Let the r.v. η(M) be the amount of time that the random
function t→ξ_t is greater than M. If I is used to denote the
indicator function of the set of positive real numbers; that is,

\[ I(y) = \begin{cases} 1, & y > 0 \\ 0, & y \le 0 \end{cases} \qquad \text{(4.4)} \]

then η(M) can be written as

\[ \eta(M) = \int_0^\infty I(\xi_t - M)\,dt. \qquad \text{(4.5)} \]

Of course, η(M) may be a defective r.v. in the sense that it
may take the value +∞ with positive probability. Taking expectations
of both sides of (4.5) it is easy to see that

\[ E\eta(M) = \int_0^\infty P(\xi_t > M)\,dt \qquad \text{(4.6)} \]

whether the RHS is finite or not.
We note in passing that in the case when the underlying stochastic
structure consists of sequences of IID r.v.'s, the RHS of (4.6) is
infinite. This fact might motivate one to ask whether or not this
integral is Abel summable to a finite value. That is, does

\[ A(\lambda) = \lambda \int_0^\infty e^{-\lambda t} P(\xi_t > M)\,dt \]

converge as λ→0+? It is amusing to recognize this integral as
P(ξ_{τ(λ)} > M), where τ(λ) is an exponentially distributed r.v., and
apply Remark 4 or Theorem 2 of section 2 to obtain

A(λ) → μ^{-1} E(X-M)^+ < +∞ as λ→0+, if μ < +∞.

Alternately, use only the classical case with M random; then
an application of the Dominated Convergence Theorem gives

\[ A(\lambda) = \int_0^\infty e^{-y} P(\xi_{y/\lambda} > M)\,dy \to \mu^{-1} E(X-M)^+ \]

as λ→0+.

Returning to the non-IID case we can state the following

Theorem 3: If the series Σ_{ν=1}^∞ E(X_ν-M)^+ < +∞ then

\[ E\eta(M) = \sum_{\nu=1}^{\infty} E(X_\nu - M)^+ \qquad \text{(4.7)} \]

This follows easily. Just let V_n(t) denote the general term
in the series (4.3) and note that

\[ \int_0^\infty V_n(t)\,dt = E(X_n - M)^+ . \]

Then since the V_n are non-negative and integrable over [0,∞),
and the series of integrals of the V_n converge, equation (4.7)
follows from a well-known result about interchanging summation
and integration (e.g. page 114, (2) [5]). This proves (4.7).

This equation then shows that η(M) is a proper r.v. and

\[ \eta(M) = \sum_{\nu=1}^{\infty} (X_\nu - M)^+ \]

with probability one.
both X is obtained by assuming that
X_ν and M are exponential with mean values θ_ν and c
respectively. Then EX_ν - E(X_ν∧M) = θ_ν²/(θ_ν+c). Then if
we take θ_ν = c/ν, Eη(M) = c.
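The closing value Eη(M) = c can be checked by summing the series of Theorem 3 directly. The sketch below assumes the reading given here of the garbled source line, namely exponential X_ν with mean θ_ν = c/ν and exponential M with mean c, so that each term is θ_ν²/(θ_ν+c) = c/(ν(ν+1)). The function name is hypothetical.

```python
def expected_time_above(c, terms):
    """Partial sum of E(X_v - M)^+ = theta_v**2 / (theta_v + c), theta_v = c / v."""
    total = 0.0
    for v in range(1, terms + 1):
        theta = c / v
        total += theta * theta / (theta + c)
    return total


if __name__ == "__main__":
    # The series telescopes: the partial sum over N terms equals c * N / (N + 1).
    for n in (10, 100, 1000):
        print(n, expected_time_above(3.0, n))
```

The partial sums increase to c, which agrees with (4.7) for this choice of means.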
CONFIDENCE INTERVALS FOR A SUM OF RENEWAL
PROCESSES  WITH  APPLICATION  IN  RELIABILITY 
ABSTRACT. In reliability theory, the time flow of failures of a
non-constant failure rate component which is replaced or renewed upon
failure  forms  a  renewal  process.  The  inter-arrival  times  of  failures 
in  this  case  are  independent  identically  distributed  positive  random 
variables.  If  a  system  which  is  composed  of  a  number  of  such  components 
is considered to have failed if one of its components fails, then the
total  number  of  system  failures  is  a  sum  of  the  individual  renewal 
processes.  The  problem  considered  in  this  paper  is  the  computation  of 
confidence  intervals  for  the  total  number  of  system  failures  over  a  given 
period  of  time  from  total  system  tests  and/or  individual  component  tests. 
Although  the  application  considered  is  one  from  reliability  theory,  the 
results  are  applicable  to  general  sums  of  renewal  processes. 

In  solving  this  particular  problem,  the  reliability  engineer  often 
assumes  that  the  sum  of  renewal  processes  asymptotically  approaches  a 
non-homogeneous  Poisson  process  or,  after  a  long  period  of  time,  a  homo¬ 
geneous  Poisson  process  with  exponentially  distributed  inter-arrival 
failure  times.  For  these  processes,  a  chi-square  distribution  can  be 
used  to  determine  confidence  intervals  for  total  number  of  failures  from 
which  confidenced  reliability  or  MTBF  can  be  determined.  It  can  be  shown, 
however,  that  the  Poisson  process  is  strictly  a  local  property  for  sums 
of  renewal  processes  and  that  confidence  intervals  derived  from  these 
assumptions are generally incorrect. This is shown by comparing the true
variance of the number of system failures with the variance derived
assuming the Poisson process.

A  scheme  for  computing  confidence  intervals  is  presented  in  which 
the  first  3  moments  of  failure  times  of  the  component  processes  are  used 
to  compute  the  mean  and  variance  of  total  system  failures.  For  a  large 
number  of  components,  the  normal  distribution  adequately  describes  the 
distribution  of  system  failures  from  which  confidence  intervals  can  be 
estimated. 




NOTATION.

f(t)          pdf of inter-arrival times of failures;
F(t)          cdf corresponding to f(t);
F̄(t)          1 - F(t);
h(t)          renewal rate; the unconditional pdf of component failure
              and subsequent renewal;
h_j(t)        renewal rate for component j;
H(t)          expected value of the number of system failures over the
              interval (0,t);
H_j(t)        renewal function for component j; the integral of h_j(t)
              over the interval (0,t);
Ĥ(t)          point estimate of H(t);
H_true(t)     true value of H(t);
N(t)          number of system failures over the interval (0,t);
N_j(t)        number of failures of component j over the interval (0,t);
n_c           number of components;
n_f           number of component failures;
n_m           number of missions over system life;
P_N(t)        probability of N failures in time t;
R(t,τ)        reliability at time t for an interval τ;
R_j(t,τ)      reliability of the jth component;
R_a(τ,n_m)    average interval reliability over system life for interval
              τ and number of intervals n_m;
R_s(t,τ)      system reliability at time t for an interval τ;
R_sa          average system reliability;
t             system time;
β             Weibull shape parameter;
η             Weibull scale parameter;
μ_j           mean inter-arrival failure time for component j;
μ_3j          third central moment of inter-arrival failure times
              for component j;
σ_j²          variance of inter-arrival failure times for component j;
              and
τ             interval or mission length for which reliability is
              required.

1. INTRODUCTION. The general problem is to determine confidence
intervals  for  reliability  of  a  series  system  of  components  from  test  data. 
Previous  solutions  to  this  problem  have  been  limited  to  constant  failure 
rate  components,  binomial  mission  reliability  which  is  constant  in  time 
and/or  reliability  for  only  the  first  system  failure  [1,2].  The  case  con¬ 
sidered  in  this  paper  which  is  often  of  more  interest  to  the  reliability 
test  engineer  involves  a  system  comprised  of  mechanical  components  which 
follow  non-constant  failure  rate  distributions.  The  system  is  operated 
continuously  until  failure  of  any  of  its  components  occurs  at  which  time 
the  component  is  replaced  or  renewed  and  system  operation  continued. 

For  the  single  component  which  is  replaced  or  renewed  upon  failure, 
the  renewal  rate  h(t)  describes  the  unconditional  failure  rate  of  the 
component  and  is  derived  from  the  underlying  distribution  of  inter¬ 
arrival  failure  times  [3,4]: 

\[ h(t) = f(t) + \int_0^t f(t-x)\,h(x)\,dx. \qquad (1) \]

The  renewal  rate  is  distinguished  here  from  the  hazard  or  conditional 
failure  rate  which  describes  failure  of  a  non-repairable  item. 




Interval or mission reliability can be determined from the renewal
rate [5-7]:

\[ R(t,\tau) = 1 - \int_t^{t+\tau} \bar{F}(t+\tau-x)\,h(x)\,dx \qquad \text{(2a)} \]

\[ \approx 1 - \int_t^{t+\tau} h(x)\,dx \quad \text{for small } \tau. \qquad \text{(2b)} \]
For practical applications, the transient interval reliability can
be averaged over system life to yield a single time independent
reliability index that characterizes a given component:

\[ R_a(\tau,n_m) = \frac{1}{n_m} \sum_{i=1}^{n_m} R(t_i,\tau). \qquad (3) \]

For a series system of components

\[ R_s(t,\tau) = \prod_{j=1}^{n_c} R_j(t,\tau) \qquad (4) \]

and

\[ R_{sa}(\tau,n_m) = \frac{1}{n_m} \sum_{i=1}^{n_m} \prod_{j=1}^{n_c} R_j(t_i,\tau). \qquad (5) \]


The time flow of failures of a non-constant failure rate component
which is replaced or renewed upon failure forms a renewal process [3].
The inter-arrival times of failures in this case are independent
identically distributed positive random variables. If a system which is
composed of a number of such components is considered to have failed if one
of its components fails (series system assumption), then the total number
of system failures is a sum of the individual renewal processes. The
problem considered here is the computation of confidence intervals for
the total number of system failures over a given period of time from
total system tests and/or individual component tests. Although the
application considered is one from reliability theory, the results are
applicable to general sums of renewal processes.


Many properties of renewal processes and sums of renewal processes
are covered in the literature; so only the final results are summarized
here [3-7]. If N_j(t) represents the total number of failures of
component j over time interval (0,t) then for the system

\[ N(t) = \sum_{j=1}^{n_c} N_j(t). \qquad (6) \]

For components which fail independently of one another, the mean and
variance of N(t) are equal to the sums of the means and variances of the
component processes:

\[ H(t) = E\{N(t)\} = \sum_{j=1}^{n_c} H_j(t) \qquad (7) \]

\[ h(t) = \frac{dH(t)}{dt} = \sum_{j=1}^{n_c} h_j(t) \qquad (8) \]

\[ \mathrm{Var}\{N(t)\} = \sum_{j=1}^{n_c} \mathrm{Var}\{N_j(t)\}. \qquad (9) \]

For small mission time interval τ and a large number of components,
the average reliability (5) can be shown to asymptotically approach the
following value [5]:

\[ R_{sa}(\tau,n_m) = 1 - \frac{1}{n_m}\,H(n_m\tau). \qquad (10) \]

In reliability applications then, where the above assumptions hold, it
suffices to deal with H(t) for the system with reliability being
determined from (10).

In  considering  the  problem  of  non-constant  failure  rate  components,  the 
reliability  engineer  often  assumes  that  the  sum  of  renewal  processes 
asymptotically  approaches  a  non-homogeneous  Poisson  process  (NHPP)  with 
increasing  number  of  components  or,  after  a  long  period  of  time,  a  homo¬ 
geneous  Poisson  process  (HPP)  with  exponentially  distributed  inter¬ 
arrival  failure  times  [5].  For  these  processes,  the  chi-square  distribu¬ 
tion  can  be  used  to  determine  confidence  intervals  for  total  number  of 
failures from which confidenced reliability or MTBF (mean-time-between-
failures) can be determined. In what follows, however, it is readily
shown that the Poisson process is strictly a local property for sums of
renewal processes and that the global confidence intervals derived from
these assumptions are generally incorrect.


2. NON-HOMOGENEOUS POISSON PROCESS AS AN APPROXIMATION TO N(t)

The distribution of number of failures for the NHPP is given as

\[ P_N(t) = \frac{[H(t)]^N e^{-H(t)}}{N!} \qquad (11) \]

\[ E\{N(t)\} = H(t) \qquad (12) \]

\[ \mathrm{Var}\{N(t)\} = H(t) \qquad (13) \]

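It suffices to show that the true variance of the sum of renewal
processes does not generally equal H(t) as shown by (13). Consider,
for example, the asymptotic renewal process for large t in which the
mean and variance for component j are given by [3]

\[ H_j(t) = \frac{t}{\mu_j} \qquad (14) \]

\[ \mathrm{Var}\{N_j(t)\} = \frac{\sigma_j^2\,t}{\mu_j^3} \qquad (15) \]

so that, from (7) and (9),

\[ H(t) = \sum_{j=1}^{n_c} \frac{t}{\mu_j} \qquad (16) \]

\[ \mathrm{Var}\{N(t)\} = \sum_{j=1}^{n_c} \frac{\sigma_j^2\,t}{\mu_j^3}. \qquad (17) \]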


In general, H(t) ≠ Var{N(t)} and the sum of renewal processes for this
example does not approach a NHPP or HPP in a global sense no matter how
large n_c becomes. For equal components, for example, 1/μ ≠ σ²/μ³ unless
σ = μ. This is the case for the exponential distribution but is only a
special case for other distributions. Although the asymptotic process
for large t was considered, the same can be shown for the sum of
ordinary renewal processes.
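The size of the discrepancy is easy to see for Weibull components. The sketch below assumes the large-t forms H(t) = t/μ and Var{N(t)} = σ²t/μ³ per component, so that for equal components the ratio of true variance to the Poisson variance is σ²/μ². The helper names are hypothetical; the Weibull moments use the standard gamma-function expressions.

```python
import math


def weibull_moments(shape, scale=1.0):
    """Mean and variance of a Weibull inter-arrival time."""
    mean = scale * math.gamma(1.0 + 1.0 / shape)
    second_raw = scale ** 2 * math.gamma(1.0 + 2.0 / shape)
    return mean, second_raw - mean ** 2


def variance_to_mean_ratio(shape, scale=1.0):
    """Asymptotic Var{N(t)} / H(t) = sigma^2 / mu^2 for equal components."""
    mean, var = weibull_moments(shape, scale)
    return var / mean ** 2


if __name__ == "__main__":
    for shape in (1.0, 2.0, 3.0):
        print(shape, variance_to_mean_ratio(shape))
```

A shape of 1 (the exponential case) gives a ratio of one; a shape of 3 gives a ratio near 0.13, so under these assumptions a Poisson-based interval would be far too wide for such components.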


3. CONFIDENCE INTERVALS USING COMPONENT MOMENTS. Since the sum
of renewal processes (6) is a sum of discrete, lattice type random
variables, it asymptotically approaches the normal distribution as an
envelope with increasing number of components [8]. Confidence intervals
then can be estimated for H(t) using normal tables for large number of
components with Ĥ(t) and its variance being determined from test data.
As will be shown later, an extra failure should be added to Ĥ(t) in
determining upper confidence limits to remove bias.

The renewal function for component j can be estimated from the
moments of the inter-arrival times of events for large t [3]:

\[ H_{jo}(t) = \frac{t}{\mu_j} + \frac{\sigma_j^2 - \mu_j^2}{2\mu_j^2} + O(1/t) \qquad (18) \]

\[ \mathrm{Var}\{N_{jo}(t)\} = \frac{\sigma_j^2\,t}{\mu_j^3}
   + \Bigl( \frac{1}{12} + \frac{5\sigma_j^4}{4\mu_j^4} - \frac{2\mu_{3j}}{3\mu_j^3} \Bigr) + O(1/t) \qquad (19) \]

for the ordinary renewal process and

\[ H_{je}(t) = \frac{t}{\mu_j} \qquad (20) \]

\[ \mathrm{Var}\{N_{je}(t)\} = \frac{\sigma_j^2\,t}{\mu_j^3}
   + \Bigl( \frac{1}{6} + \frac{\sigma_j^4}{2\mu_j^4} - \frac{\mu_{3j}}{3\mu_j^3} \Bigr) + O(1/t) \qquad (21) \]

for the equilibrium renewal process. In the ordinary renewal process
all components are new at t=0. The equilibrium process, on the other
hand, is one which has been running for a long time before it is first
observed (see Cox [3], Chapter 2 for more detailed description of these
processes).
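The large-t moment formulas for the ordinary renewal process can be sketched as below. This assumes the standard asymptotic forms from Cox [3] (mean t/μ + (σ²-μ²)/2μ², and variance σ²t/μ³ plus the constant 1/12 + 5σ⁴/4μ⁴ - 2μ₃/3μ³, with μ₃ the third central moment) and drops the remainder terms. Function names are hypothetical.

```python
import math


def ordinary_renewal_mean(t, mu, var):
    """Large-t mean number of renewals for one component, remainder dropped."""
    return t / mu + (var - mu ** 2) / (2.0 * mu ** 2)


def ordinary_renewal_variance(t, mu, var, mu3):
    """Large-t variance of the number of renewals, remainder dropped."""
    const = 1.0 / 12.0 + 5.0 * var ** 2 / (4.0 * mu ** 4) - 2.0 * mu3 / (3.0 * mu ** 3)
    return var * t / mu ** 3 + const


def weibull_central_moments(shape, scale):
    """Mean, variance and third central moment of a Weibull distribution."""
    m1 = scale * math.gamma(1.0 + 1.0 / shape)
    m2 = scale ** 2 * math.gamma(1.0 + 2.0 / shape)
    m3 = scale ** 3 * math.gamma(1.0 + 3.0 / shape)
    return m1, m2 - m1 ** 2, m3 - 3.0 * m1 * m2 + 2.0 * m1 ** 3


if __name__ == "__main__":
    mu, var, mu3 = weibull_central_moments(shape=3.0, scale=1.0)
    # Ten identical components observed up to t = 5.
    print(10 * ordinary_renewal_mean(5.0, mu, var))
    print(10 * ordinary_renewal_variance(5.0, mu, var, mu3))
```

With shape 3, scale 1, t = 5 and ten components the mean comes out near 51.65, in line with the value 51.7 listed for H_true in Table I. For an exponential distribution both constant terms vanish and the Poisson values are recovered.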

Case 1: Complete Samples with large t

For this case the moments can be estimated without making any
assumption about the underlying distribution:

\[ \hat{\mu}_j = \frac{1}{n_{fj}} \sum_{i=1}^{n_{fj}} x_{ji} \qquad \text{(22a)} \]

\[ \hat{\sigma}_j^2 = \sum_{i=1}^{n_{fj}} (x_{ji} - \hat{\mu}_j)^2 \Big/ (n_{fj} - 1) \qquad \text{(22b)} \]

\[ \hat{\mu}_{3j} = \text{third central sample moment of } x_{j1},\dots,x_{j n_{fj}} \qquad \text{(22c)} \]

\[ \mathrm{Var}\{\hat{H}_j(t)\} = \mathrm{Var}\{N_j(t)\}/n_{fj} \qquad (23) \]

in which x_ji, i=1,...,n_fj are n_fj failure times for component j.
Substituting (22) into (18), (19) and (23) or (20), (21) and (23) yields
component estimates for Ĥ_j(t) and Var{Ĥ_j(t)}. System Ĥ(t) and its
variance can then be determined from (7) and (9), from which confidence
limits on the true value of H(t) can be estimated using normal tables.

Case 2: Censored Samples

For this case, a theoretical distribution for inter-arrival failure
times must be assumed, such as the Weibull or gamma, with the moments
being estimated, for example, using maximum likelihood. Confidence
limits can then be determined assuming the normal distribution for total
number of pooled failures.

4.  SOME  NUMERICAL  RESULTS  FOR  CASE  1 

A  particular  example  has  been  considered  to  study  the  frequency 
exactness  of  the  confidence  limits  described  above.  For  this  study 
Monte  Carlo  simulation  is  used  to  artificially  generate  sample  outcomes 
for a system with given component parameters. The system is assumed to
be composed of n_c identical Weibull components with parameters η and β.
Using these parameters, failure times for a given number of failures are
generated for each component using random numbers with the quantities
μ̂_j, σ̂_j² and μ̂_3j being computed from (22). From these Ĥ_j(t) and
Var{N_j(t)} are computed using (18) and (19) where large t is assumed.
Estimates for the system Ĥ(t) and Var(Ĥ(t)) are then determined from (7),
(9) and (23).

Assuming the normal distribution for Ĥ(t), confidence limits on
H(t) can be determined from the given set of sample outcomes. This is
repeated 1000 times for a fixed set of parameters. The normal cdf,
gauf(H(t)), is evaluated at the true and known value of H(t) for each
of these sample outcomes. For exact frequency confidence intervals,
the function gauf(H_true(t)) should be uniform on (0,1.0). Results
indicate that although the confidence limits are not exact, they are
close enough for practical purposes.

Table I lists some of the results of these trials for the upper
90% confidence limit on H(t) (lower 90% confidence limit on average
reliability). An extra failure had to be added to the total number of
system failures to remove bias. For exactness, the percent of trials
in which H_true is greater than the upper 90% confidence limit,
Ĥ_90, should be 10%. As can be seen from the results in Table I, the
confidence limits are close to this requirement. The confidence limit
Ĥ_90, therefore, is judged to be exact for this case as long as one extra
failure is added to total number of test failures.

The  main  limitations  of  the  above  approach  are  the  requirement  for 
long  system  times  and  large  number  of  components  and/or  failures  for 
exactness.  Also,  in  computing  reliability  from  H(t) ,  small  mission  times 
(high  reliability)  are  required  for  the  approximation  (10).  The  computa¬ 
tional  methods  involved,  however,  are  relatively  straightforward  and  the 
approach  appears  to  be  a  sound  one. 


TABLE I

RESULTS OF MONTE CARLO TRIALS TO STUDY UPPER 90%
CONFIDENCE LIMIT FOR SUM OF RENEWAL PROCESSES

NUMBER OF     NUMBER OF FAILURES     H_true(t=5)     % OF TRIALS
COMPONENTS    PER COMPONENT                          H_true > Ĥ_90 *

    10               10                 51.7              9.8
    10                5                 51.7             10.6
     5                5                 25.8              9.6
     2                5                 10.3              7.1

* 90 PERCENTILE OF DISTRIBUTION GAUF (Ĥ+1, σ²_Ĥ)

WEIBULL COMPONENT PARAMETERS: η=1.0, β=3.0
ABSTRACT. The problem is that of detecting anomalous patterns in
environmental grid data approximately coincident with a point stimulus
in the region including all data sources.

The particular case involved is to replace the current, rather
awkward, technique with a more concise and efficient algorithm for
detecting anomalous growth patterns of tree-ring chronologies approximately
coincident with volcanic eruptions.

STATEMENT  OF  THE  PROBLEM:  The  problem  I  am  presenting  here  is  a 
problem arising in my climatology research on estimating climatic
anomalies following volcanic eruptions. People have long suspected that such
anomalies would occur (Franklin, 1783 Diary). It comes as no surprise
to most people that something as majestic as a volcano should perturb
climate  and  yet  compelling  evidence  has  not  been  found,  probably  due  to 
the  short  length  of  meteorological  data  records  available  and/or  improper 
methods  of  analysis. 

I  am  estimating  these  climatic  anomalies  by  computing  a  regression 
model  for  climatic  variables  such  as  seasonal  temperature  and  precipi¬ 
tation  averages  based  on  tree-ring  chronologies.  In  this  way  I  am  hoping 
to  attach  to  a  much  longer  record  of  data.  The  regression  model  is  a 
principal  component  regression  calculation  which  I  discussed  at  this  con¬ 
ference  last  year;  and  uses  continuous  tree-ring  chronologies  and  a  con¬ 
current  meteorological  record  taken  at,  or  near,  the  tree  site  for  which 
the  model  is  computed.  That  is,  for  each  tree  site  there  is  one  model 
for  each  climatic  variable  for  each  season. 

With these models, or transfer functions, I estimate the climatic
anomalies  following  volcanic  eruptions  by  applying  anomalous  sequences 
of  annual  tree  growth  rings  following  those  eruptions  as  input  to  the 
transfer  function. 

The  problem  I  am  presenting  here  is  how  to  improve  the  accuracy  of 
the  detection  of  anomalous  tree  growth  due  -  probably  -  to  volcanic 
activity  and  to  perform  the  detection  more  economically. 

This may not seem related to telemetry in the usual sense; however,
I contend that it is, or has within it, a problem in multiple object
telemetry. In this case, telemetry is interpreted as the receipt of a
signal transmitted by a sensor operating in an environment wherein the
signal is supposed to contain information about its environment.

In my case, the sensor is the tree. The signal is the chronology of
its annual growth rings. These growth rings differ in width in response
to  climatic  conditions  present  at  the  site.  Figure  I  illustrates  a  section 
of  a  chronology  and  a  graph  of  the  ring  widths.  As  one  can  see,  this  signal 
looks  very  much  like  many  other  kinds  of  signals  one  may  encounter  in  a 
telemetry  operation. 

The  signal  is  supposed  to  contain  information  about  the  climatic  con¬ 
ditions  at  the  tree  site  during  the  time  that  the  growth  ring  was  influ¬ 
enced.  A  considerable  amount  of  work  done,  and  currently  underway,  at 
the  Laboratory  of  Tree  Ring  Research  at  the  University  of  Arizona  supports 
this  supposition.  The  problem  is  that  not  all  tree  ring  chronologies  are 
indicative  of  climate.  Only  sensitive  trees  have  chronologies  which  re¬ 
flect  their  past  climate  and  then  only  when  properly  interpreted. 

There  are  many  factors  which  influence  a  tree's  response  to  a  partic¬ 
ular  climatic  variable.  Topography  is  the  primary  class  of  these  factors 
which  include:  water  runoff,  exposure  (north  or  shady  side  versus  south 
or  sunny  side) ,  altitude  (growth  season) ,  subsurface  conditions  influenc¬ 
ing  root  structures,  availability  of  ground  water  and  density  of  tree 
growth. However, these factors are, for the most part, reasonably
constant over the time period considered; that is, a few hundred years. Thus,
the  sensitivity  of  a  tree  to  climatic  change  can  be  considered  to  be  reason¬ 
ably  constant  except  when  it  is  obviously  not  true  as  in  cases  such  as  fire, 
earthquake,  etc.  Figure  II  illustrates  these  opposite  conditions,  compla¬ 
cent  and  sensitive  trees,  as  a  function  of  topography. 

A sample illustration of this sensitivity is shown when we consider
a  tree  which  is  living  in  an  abundant  environment  (as  seen  by  the  tree) 
with  a  surplus  of  water.  This  tree  would  have  a  "complacent"  ring  series 
because  such  a  tree  will  not  suffer  much,  if  at  all,  during  a  relatively 
dry  growing  season  with  less,  but  still  adequate,  precipitation.  However, 
a  farmer  in  the  same  area  with  a  crop  tuned  to  the  normal  precipitation 
(abundant  from  the  tree's  point  of  view)  might  consider  that  dry  spell  a 
near  disaster.  This  complacency  is  compounded  when  one  notes  that  most 
trees  tend  to  integrate  over  several  years  with  the  emphasis  placed  on 
the  climate  of  the  year  preceding  the  current  growing  season. 

The point is that a given species of tree may have many different
responses to highly similar climates, depending on the specific locations
of the trees and on the conditions during the period of up to three years
preceding the current growing season.

Now  it  is  possible  to  see  the  nature  of  the  problem  I  am  addressing. 

As shown in Fig. III, I have selected, as sensors, ten tree sites; all
Douglas Fir and all with fairly high variance in the chronology as an
indication of sensitivity. These ten tree sites, indicated by the dots,
constitute a grid of climatic sensors, each of which has a response
function defined only for its own location, but which has been assumed
to be reasonably time invariant.

Now the problem becomes somewhat more complicated. This is because
I am looking for the result of an unknown, but probably different, response
function applied to the output of another response function, the
atmosphere, which is also unknown and is responding to a point stimulus
(the volcanic eruption). It is the nature of this atmospheric response
function that I would like to eventually learn something about from the
regression-based estimates of the climatic anomalies mentioned earlier.

The response of the atmosphere to this stimulus at some location on
the earth is, most likely, some function of: the type of stimulus (that
is, large, small, duration, etc.); the location of the tree site (sensor);
the time lag from the eruption; the time of the year; and the initial
conditions at that time of the year.

The response function of the trees to the atmospheric (climatic)
conditions is some function of: the season; its own serial correlation;
its initial condition; and its location (topography). The response
function of the trees omits the physiological variables, as I am
considering them as explicit since I am not modeling the tree growth.

The first part of the project, which is the subject of this paper,
was to detect the anomalous, indirect response, if any exists, of the
trees to volcanic eruptions. To date, the method of detecting these
possible anomalous sequences of growth rings, or anomalous signals, has
been as follows: First, I considered only one site at a time, thereby
permitting me to ignore all parameters relating to location. Second, the
tree integrates over all seasons; so, for the purposes of signal
detection, I must ignore season. Now then, it must be remarked that the
amount of change in the tree's variance due to volcanic activity may be
only a very small portion of the total variance in the tree ring chronology.

Assuming that the chronology is a weakly stationary random series,
a kind of signal averaging was accomplished to detect a possible average,
or typical, response signal of the tree to specific "types" of volcanic
eruptions.

The tree ring data were formed into a lagged array, as shown in Fig. IV,
wherein the lag is fourteen years. The lag is more than sufficient to
accommodate the serial correlation of about three years and is guessed
to be sufficient time to cover any lag of the propagation of the
atmospheric phenomena. This lag also side-steps two favorite cycles: lunar
and solar.

The data in an array such as shown in Fig. IV contain all of the
data, and as such the array is referred to as $D^t_{nm}$, the total ring
array. A similar array is formed from the columns of $D^t$ such that the
date of the growth ring index (percent of normal growth) in the first row
of each column is the date of a volcanic eruption of a specified class of
eruptions, parameterized by size of eruption and the region of the earth
containing the volcano. This data array is referred to as the signal array
and is denoted by $D^s_{nq}$.

A third array is the background array, $D^b$, and is the direct
subtraction of $D^s$ from $D^t$: $D^b = D^t \ominus D^s$.

Now then, the row averages of each of these arrays were computed.
These constitute average growth curves of the tree for a fourteen-year
period under: normal conditions, conditions coincident with volcanic
activity of the class specified, and under conditions excluding those
concurrent with that specific class of volcanic activity.
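The array construction described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names are hypothetical, each column is taken to be a fourteen-year window starting at successive years, and the background array is taken to be the columns of the total array that are not in the signal array.

```python
import numpy as np

def lagged_array(series, lag=14):
    """Return a (lag, m) array whose column j is series[j : j + lag]."""
    x = np.asarray(series, dtype=float)
    m = x.size - lag + 1
    if m < 1:
        raise ValueError("series must contain at least `lag` values")
    return np.stack([x[j:j + lag] for j in range(m)], axis=1)

def split_by_eruption(series, years, eruption_years, lag=14):
    """Split the total array into signal and background columns.

    A column belongs to the signal array when the year of its first row
    is an eruption year of the chosen class (assumed interpretation).
    """
    total = lagged_array(series, lag)
    first_row_years = np.asarray(years)[: total.shape[1]]
    is_signal = np.isin(first_row_years, list(eruption_years))
    return total, total[:, is_signal], total[:, ~is_signal]

def average_growth_curve(array):
    """Row averages: one mean ring index per lag year."""
    return array.mean(axis=1)
```

Calling `average_growth_curve` on each of the three arrays gives the three fourteen-year curves that the comparison tests operate on.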

A chi-square comparison was made with the following hypotheses:

1. That the average growth curve of the signal array, $D^s$, was
indistinguishable from the average growth curve of the total array, $D^t$.

2. That the average growth curve of the signal array, $D^s$, was
indistinguishable from that of the background array, $D^b$.

3. That the average growth curve of the background array, $D^b$, was
distinguishable from that of the total array, $D^t$.

4. That the average growth curve of the total array, $D^t$, was
distinguishable from the flat curve of the average of the total chronology.

If  all  of  these  hypotheses  are  rejected,  then  the  average  growth  curve 
of  that  signal  array  is  considered  a  probable,  valid  response  to  a  volcanic 
eruption  of  the  class  specified.  From  about  300  cases,  35  passed  this  test 
at  the  99%  confidence  level. 

A second test was devised involving the comparison of the first
eigenvectors of the variance/covariance matrix of the ring signal array,
$D^s$, computed two ways. The variance/covariance matrices of the signal
array were computed: (1) using the row averages of the total ring array,
$\bar{d}^t_i$, as the mean; and (2) using the row averages of the ring
signal array, $D^s$, in the usual fashion. Thus we have:

$$C^t_{nn}(d^s) = \frac{\left[(d^s_{ij} - \bar{d}^t_i)(d^s_{ij} - \bar{d}^t_i)^T\right]}{m - 1}$$

and

$$C^s_{nn}(d^s) = \frac{\left[(d^s_{ij} - \bar{d}^s_i)(d^s_{ij} - \bar{d}^s_i)^T\right]}{m - 1}$$

Then extract the eigenvectors:

$$C^t_{nn}(d^s)\, E^t_{nn} = E^t_{nn}\, \Lambda^t_n$$

and

$$C^s_{nn}(d^s)\, E^s_{nn} = E^s_{nn}\, \Lambda^s_n$$

Next, compare the first eigenvectors of $E^t_{nn}$ and $E^s_{nn}$. If
they are significantly different, then the array $D^s$ is usable as an
array of tree ring data comprised of significant responses. This was a very
stringent test and out of the 35 candidates, only six passed.
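A minimal sketch of this second test follows. The paper does not say how the two first eigenvectors were compared, so the absolute cosine between them is used here purely as an illustrative measure. The function names are hypothetical, and the divisor is the number of signal columns minus one, which is an assumption (the scaling does not change the eigenvectors).

```python
import numpy as np

def covariance_about(signal, row_means):
    """Covariance of the rows of `signal` (n x q) about the given row means."""
    dev = signal - np.asarray(row_means, dtype=float)[:, None]
    return dev @ dev.T / (signal.shape[1] - 1)

def first_eigenvector(cov):
    """Unit eigenvector belonging to the largest eigenvalue of a symmetric matrix."""
    values, vectors = np.linalg.eigh(cov)  # eigenvalues come back in ascending order
    return vectors[:, -1]

def eigenvector_agreement(total, signal):
    """Absolute cosine between the first eigenvectors computed the two ways."""
    c_t = covariance_about(signal, total.mean(axis=1))   # about the total-array means
    c_s = covariance_about(signal, signal.mean(axis=1))  # about the signal-array means
    e_t = first_eigenvector(c_t)
    e_s = first_eigenvector(c_s)
    return abs(float(e_t @ e_s))  # the sign of an eigenvector is arbitrary
```

A value near 1 means the two eigenvectors point the same way; how far below 1 counts as "significantly different" is left open here, as it is in the text.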


The  computer  time  required  to  perform  all  of  these  tests,  for  all  ten 
sites  and  thirty  classes  of  volcanic  eruptions,  was  about  ten  hours  on  a 
CDC  6500.  This  did  not  include  the  comparison  of  the  eigenvectors,  but 
only  their  computation.  Thus,  the  need  for  a  new  method. 

Another,  related,  reason  for  initiating  this  work  is  to  begin  the 
development  of  a  statistical  description  of  tree  growth  which  will  contain 
information  about  both  the  spatial  relationships  of  the  tree  sites;  and, 
simultaneously,  the  temporal  behavior  of  the  individual  tree  sites  and  the 
interrelationship  between  the  two  descriptions  of  the  tree  growth. 

One of the approaches to this problem I have started is to devise an
entropy function for each column of the total array. The function is built
from probabilities $P_{ij}$, computed using the statistics of the
chronology, with the indices

l = tree site location
i = row
j = column

The intent was to detect a departure from normal growth during the
fourteen year period following any year. The data array, $D^t$, would then
be collapsed into a one dimensional sequence of entropy values for each
tree site. These data streams could then be considered as variables indexed
by location and analyzed by multivariate techniques for the time invariant
relationship of the time lagged behavior between each site. Furthermore,
by computing a conditional entropy, the serial correlation of the trees
could be accounted for.

In this way, it is hoped that those tree sites with large and/or
correlated variance of abnormal behavior will be selected by eigenvector
analysis.

Another variation of this method would be to form a lagged array from
one of the principal components of a spatial array of tree ring
chronologies sampling an entire region, and then to perform the entropy
calculation on that lagged array. This would highlight abnormal growth
occurring simultaneously throughout the region.

Now, I would like to hear any comments and suggestions the panel might
wish to make.

Figure 1. A segment of the bristlecone pine master chronology
representing three trees from 900 to 840 B.C.

Fig. III. Distribution of Ten Tree Sites (Sensors) in North America.

Fig. IV. Lagged array formed from a data stream m long. The $d_{ij}$ of
$D^t$ are the tree ring indices, as computed from a given chronology by the
Laboratory of Tree Ring Research at the University of Arizona,
corresponding to specific years of growth.


OUTLIER  DETECTION  PROCEDURES  IN 
TRAJECTORY  DATA  REDUCTION 

William  S.  Agee  and  Robert  H.  Turner 
Analysis  and  Computation  Division 
National  Range  Operations  Directorate 
US  Army  White  Sands  Missile  Range 
White  Sands  Missile  Range,  New  Mexico 


ABSTRACT. Outlier detection procedures are used extensively in
trajectory data reduction at White Sands Missile Range (WSMR). There are
three distinct circumstances in which outlier detection procedures are
used in trajectory data reduction. These are recursive filtering,
weighted least squares batch processing of trajectory measurements, and
unweighted least squares processing. Each of these processes uses a
different outlier detection procedure. This paper describes the use of
outlier detection procedures at WSMR, the specific procedures used in the
various data reduction processes, and the limits within which each of the
procedures performs satisfactorily. Of prime concern are the situations
in which the outlier detection procedures fail to detect some obvious
outliers. These undetected outliers destroy automated data reduction
procedures, causing a significant number of reruns with human detection
of these outliers. The performance of various outlier detection
procedures, those currently used at WSMR and some others, is shown on
typical data sets for which the procedures fail. It is hoped that, in
addition to obtaining some suggestions on improving outlier detection used
in WSMR data reduction, this presentation will stimulate further
investigation into outlier detection methods by Army researchers.

1. INTRODUCTION. Some outlier detection techniques for batch and
recursive processors which produce trajectory estimates from
instrumentation measurements are described.

Although there are some outlier detectors in the batch processor, a
pre-processor is necessary to eliminate those outliers which could ruin
the batch process beyond recovery. This pre-processor removes the trend
using an unweighted least squares process and detects outliers using two
tests. A better way of removing the trend is necessary when some types
of outliers are present. Also, since some types of outliers produce a
masking effect which makes sequential procedures insensitive, other tests
are needed. The outlier detectors are good in the batch processor and
very good in the recursive processor.

2. PRE-PROCESSOR

a. Process. Small samples (one to four seconds) of 10 to 50
measurements of each observation are fit to a second degree polynomial in
time using unweighted least squares.

The observation model is

$$z_i = a_0 + a_1 t_i + a_2 t_i^2 + e_i, \qquad i = 1, \ldots, n$$

or

$$Z = TA + e$$

where $e$ is random noise with zero mean and $\sigma^2$ variance.

Minimizing $e^T e$ with respect to $A$ we have

$$\hat{A} = (T^T T)^{-1} T^T Z$$

and the set of residuals

$$r = Z - T\hat{A}$$
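The fit and its residuals can be sketched as follows. The function name is hypothetical, and `numpy.linalg.lstsq` is used in place of forming the normal equations explicitly; it minimizes the same sum of squared errors and is better conditioned for short time spans.

```python
import numpy as np

def quadratic_fit_residuals(t, z):
    """Unweighted least squares fit of z = a0 + a1*t + a2*t**2.

    Returns the coefficient vector A and the residuals r = Z - T A.
    """
    t = np.asarray(t, dtype=float)
    z = np.asarray(z, dtype=float)
    T = np.column_stack([np.ones_like(t), t, t**2])  # design matrix
    A, *_ = np.linalg.lstsq(T, z, rcond=None)        # minimizes ||Z - T A||^2
    return A, z - T @ A
```

For noise-free quadratic data the recovered coefficients match the generating ones and the residuals vanish to rounding error.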

b. Outlier Detection. Sample skewness and kurtosis coefficients are
computed from the residuals

$$\sqrt{b_1} = \sqrt{n}\, \sum_{i=1}^{n} r_i^3 \Big/ \Big(\sum_{i=1}^{n} r_i^2\Big)^{3/2}$$

$$b_2 = (n-3) \sum_{i=1}^{n} r_i^4 \Big/ \Big(\sum_{i=1}^{n} r_i^2\Big)^{2}$$

If either $\sqrt{b_1}$ or $b_2$ exceeds its respective 5% significance
level critical value, the observation corresponding to the largest residual
is deleted and the entire process is repeated with the remaining
observations.
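The sequential editing loop can be sketched as below. Several things here are assumptions rather than the WSMR procedure: the coefficients use the textbook normalizing factor n (the paper's printed factors may differ), the critical values are passed in by the caller because the paper does not list them, skewness is tested two-sided, and a hypothetical `min_points` floor stops the loop.

```python
import numpy as np

def shape_statistics(r):
    """Sample skewness sqrt(b1) and kurtosis b2 of residuals (textbook factor n)."""
    r = np.asarray(r, dtype=float)
    n = r.size
    s2 = np.sum(r**2)
    sqrt_b1 = np.sqrt(n) * np.sum(r**3) / s2**1.5
    b2 = n * np.sum(r**4) / s2**2
    return sqrt_b1, b2

def edit_sample(t, z, skew_crit, kurt_crit, min_points=5):
    """Refit a quadratic and delete the largest residual until both tests pass.

    Returns the indices of the observations that were kept.
    """
    t = np.asarray(t, dtype=float)
    z = np.asarray(z, dtype=float)
    keep = np.arange(t.size)
    while keep.size > min_points:
        T = np.column_stack([np.ones(keep.size), t[keep], t[keep] ** 2])
        A, *_ = np.linalg.lstsq(T, z[keep], rcond=None)
        r = z[keep] - T @ A
        sqrt_b1, b2 = shape_statistics(r)
        if abs(sqrt_b1) <= skew_crit and b2 <= kurt_crit:
            break  # no outlier indicated by either test
        keep = np.delete(keep, np.argmax(np.abs(r)))  # drop the largest residual
    return keep
```

With a single large outlier the loop removes it on the first pass. With several adjacent outliers the fit itself is distorted, which is the failure mode the examples in this section illustrate.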

We hope that this initial process will detect most of the outliers
automatically with as little human intervention as possible and a minimum
of false alarms. When there are too many outliers or a few large
ones it is almost impossible to detect them. In these cases, if the
presence of an outlier is detected, the good observations adjacent to the
outliers are the ones rejected.

c. Examples. These two samples show that the presence of outliers
can sometimes distort a curve fit so much that outliers cannot be
detected. Furthermore, if the presence of outliers were detected, sometimes
the good observations are rejected while the outliers remain. Each sample
has three obvious outliers which were not detected from the first set of
residuals.

(1)  Example  1.  Assume  some  other  test  could  detect  the  presence  of 
outliers  and  that  the  observation  with  the  largest  residual  was  rejected. 
One  of  the  outliers  would  be  rejected.  The  two  previously  described  tests 
and  rejection  criteria  would  now  sequentially  detect  and  reject  the  two 
remaining  outliers. 

(2) Data for Example 1.

      Obs     Res(1)     Res(2)     Res(3)     Res(4)
   .21709    -.33222    -.29484    -.20135    -.00001
   .21824    -.31419    -.26636    -.17482     .00001
   .95519     .44164     .49745     .58588
   .94511     .45245     .51376
   .93499     .46522
   .22288    -.22199    -.15714    -.08487     .00001
   .22405    -.19391    -.13101    -.06642    -.00002
   .22530    -.16375    -.10528    -.04951     .00002
   .22652    -.13161    -.08006    -.03424     .00002
   .22770    -.09751    -.05535    -.02063    -.00004
   .22900    -.06128    -.03100    -.00852     .00000
   .23028    -.02307    -.00715     .00195     .00001
   .23155    -.01714     .01622     .01079    -.00001
   .23286     .05940     .03915     .01805     .00000
   .23418     .10367     .06162     .02370     .00001

(3) Example 2. Again assume that some other test could detect the
presence of outliers and that the observation with the largest residual
was rejected. The first point rejected would be the good observation in
between the outliers. Two outliers would be the next to go. Further
application would reject good observations and never get the one
remaining outlier. The outlier detectors previously described do not
indicate the presence of outliers in any set of residuals.

(4) Data for Example 2.

       Obs      Res(1)      Res(2)      Res(3)      Res(4)
  -1.70987     -.15777     -.28786     -.36369     -.37731
  -1.70942     -.00020     -.03242     -.08045     -.10634
  -1.70893      .10548      .14636     -.12669      .09700
  -1.70845      .15923      .24843      .25767      .23267
  -1.70793      .16109      .27383      .31254      .30071
  -1.70741      .11102      .22252      .29127      .30108
  -1.70682      .00910      .09458      .19393      .23385
  -1.70626     -.14478     -.11009      .02041      .09892
  -1.70571     -.35060     -.39148     -.22927     -.10368
  -1.70510     -.60828     -.74951     -.55502     -.37389
  -1.70449     -.91788    -1.18425     -.95693     -.71177
   1.43777     1.86223     1.44596
   1.44602     1.45641      .86545     1.16012
  -1.70257    -2.15818
   1.44667      .47314     -.54153     -.17727      .40876

d. Conclusion. More work needs to be done in:

(1) Removing trends in the presence of outliers.

(2) Determining whether the testing and rejection of small subsets
of observations as a one-time process is more effective than the sequential
application of testing and rejecting one observation at a time.

3. BATCH PROCESSOR

a. Process. This is a weighted least squares process which uses
observation variances as weights. It produces all position vector
estimates simultaneously. It is a nonlinear process which is linearized
about a guess trajectory and is iterated to convergence before editing. The
measurement model for the $\alpha$th observation at the $i$th time point is

$$Z_{i\alpha} = h_\alpha(x_i) + e_{i\alpha}$$

where $e_{i\alpha}$ is random noise with zero mean and $\sigma_{i\alpha}^2$
variance.

Solve for $x$ by minimizing the weighted sum of squares

$$\sum_{i=1}^{m} \sum_{\alpha \in I_i} \left[ \frac{Z_{i\alpha} - h_\alpha(x_i)}{\sigma_{i\alpha}} \right]^2$$

with respect to $x_i$.

b. Outlier Detection

(1) At each time point $i$, for each observation $\alpha$ in the
solution a normalized residual is computed

$$r^*_{i\alpha} = \frac{Z_{i\alpha} - h_\alpha(x_i)}{\sigma(r_{i\alpha})}$$

where $\sigma^2(r_{i\alpha})$ is the estimated residual variance
approximated by

$$\sigma^2(r_{i\alpha}) = \sigma_{i\alpha}^2 + H_\alpha (H^T W H)^{-1} H_\alpha^T$$

$$H_\alpha = \frac{\partial h_\alpha(x_i)}{\partial x_i}$$

$$H^T W H = \sum_{\alpha \in I_i} \frac{H_\alpha^T H_\alpha}{\sigma_{i\alpha}^2}$$

If $3 < |r^*_{i\alpha}| < 5$, the respective observation is deleted
temporarily.

If $|r^*_{i\alpha}| > 5$, the respective observation is deleted
permanently.

If either of these tests rejects any observations, the solution is
iterated to convergence with the remaining observations and tested again.
This test indicates those observations whose residuals are not consistent
with their variance and geometry.
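The two-level rule can be sketched as follows. The function names and label strings are hypothetical, and the treatment of a residual exactly equal to 3 or 5 is an assumption, since the text leaves the boundaries open.

```python
import numpy as np

def normalized_residual(residual, sigma2, h_row, solution_cov):
    """Residual divided by the square root of sigma^2 + H C H^T.

    `solution_cov` stands for (H^T W H)^-1 and `h_row` for the partial
    derivatives of one observation with respect to the position vector.
    """
    h = np.asarray(h_row, dtype=float)
    c = np.asarray(solution_cov, dtype=float)
    return residual / np.sqrt(sigma2 + h @ c @ h)

def classify_normalized_residuals(r_star, soft=3.0, hard=5.0):
    """Label each normalized residual using the 3 and 5 thresholds."""
    a = np.abs(np.asarray(r_star, dtype=float))
    labels = np.full(a.shape, "keep", dtype=object)
    labels[a > soft] = "delete_temporarily"
    labels[a > hard] = "delete_permanently"  # overrides the temporary label
    return labels
```

After any deletion the solution would be iterated to convergence again on the remaining observations before the residuals are re-tested.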

(2) When no more observations are rejected with the previous test, a
sum of weighted residuals for each observation, over all the time points
at which it was processed, is computed:

$$R_\alpha = \sum_{i \mid \alpha \in I_i} r^*_{i\alpha}$$

If $\max_\alpha |R_\alpha| > 3$, all of that instrument's observations
are deleted from all further processing, all temporarily deleted
observations are enabled, and the whole process is reiterated. This test
indicates a consistent bias in an instrument's set of observations.

4. RECURSIVE PROCESSOR

a. Process. This is an extended Kalman filter which produces state
vector (position, velocity, acceleration) estimates sequentially.
Observation variance estimates are also produced sequentially. The
predicted state estimate is

$$x(k+1|k) = F(k)\,x(k)$$

the corrected state estimate is

$$x(k+1) = x(k+1|k) + K(k)\,r(k+1|k)$$

where $K(k)$ is the Kalman filter optimal gain matrix and

$$r(k+1|k) = Z(k+1) - h(k+1)\,x(k+1|k)$$

is the vector of observation residuals.

The variance estimate

$$\sigma_i^2(k+1) = Q_i(k+1)$$

is a steady state function of the exponentially weighted sum of squared
residuals

$$Q_i(k+1) = w\left[Q_i(k) + r_i^2(k+1|k)\right], \qquad 0 < w < 1$$
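The recursion for the weighted sum of squared residuals is simple to sketch. The text calls the variance estimate a steady state function of Q without the scaling being clear; one plausible reading, used here only as an assumption, follows from the fixed point of the recursion: if the squared residual has constant mean s, then Q settles at w s / (1 - w), so s is recovered as (1 - w) Q / w. Function names are hypothetical.

```python
def update_weighted_sum(q, residual, w):
    """One step of Q(k+1) = w * [Q(k) + r^2], with 0 < w < 1."""
    if not 0.0 < w < 1.0:
        raise ValueError("w must lie strictly between 0 and 1")
    return w * (q + residual**2)

def steady_state_variance(q, w):
    """Variance implied by Q at steady state (assumed scaling, see lead-in)."""
    return (1.0 - w) / w * q
```

A value of w close to 1 gives a long memory and a smooth variance estimate; a smaller w lets the estimate follow changes in measurement quality more quickly.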


b. Outlier Detection. For each observation $i$ at time $k+1$, a
two-level outlier detection scheme is used on the normalized residual

$$r_i^*(k+1|k) = \frac{r_i(k+1|k)}{\sigma(r_i)}$$

$$\sigma^2(r_i) = \sigma_i^2(k) + H_i P H_i^T$$

$$H_i = \frac{\partial h_i(x)}{\partial x}$$

$P$ is the state covariance matrix.

(1) If $r_i^*(k+1|k) > 12$, reject the $i$th observation for time $k+1$.

(2) If $4 < r_i^*(k+1|k) < 12$, update $Q_i(k+1)$.

(3) If $0 < r_i^*(k+1|k) < 4$, update $Q_i(k+1)$, $\sigma_i^2(k+1)$ and
$x(k+1|k)$.
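The gating rule above can be sketched as a small decision function. The function name and action labels are hypothetical. Two further assumptions: the magnitude of the normalized residual is tested, and a residual exactly equal to 4 or 12 falls in the lower band.

```python
def gate_observation(r_star, update_limit=4.0, reject_limit=12.0):
    """Decide what a normalized residual is allowed to update."""
    a = abs(r_star)
    if a > reject_limit:
        return "reject"         # observation discarded for this time point
    if a > update_limit:
        return "update_Q_only"  # only the weighted sum of squares is updated
    return "update_all"         # Q, the variance estimate and the state

```

The middle band lets a run of moderately large residuals raise the variance estimate without pulling the state estimate toward a possible outlier.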
ABSTRACT. The development of simulations of physiological systems has been
used as a guide in the design of animal experimentation used to study such
endocrine functions as glucose-insulin interaction and testosterone dynamics.

Models of pulmonary respiratory function have been studied in an effort to
redesign several pulmonary function tests so that particular system parameters
could be evaluated directly from test results.

Model development is thus a useful procedure in studying physiological
systems, for it focuses attention on the cause-effect relationship at each
stage of the homeostatic process, and thus integrates in a systematic way all
that is known about a particular system. In addition, the requirements and
constraints of the model development clearly point out gaps in our knowledge
of overall system function, and in an effort to obtain these missing data one
can utilize the model structure in designing the necessary experimental
protocols. The results of these experiments will help complete the model in a
physiologically meaningful way, and once complete, the model can be used to
study the effects of parameter variation on system response under both normal
and pathological situations.

The  simulation  can  be  used  in  conjunction  with,  and  as  a  supplement  to, 
animal  experimentation.  For  example,  the  large  number  of  extraneous,  and  possibly 
even  unknown,  factors  which  often  obscure  or  invalidate  the  results  of  live 
animal  experiments  are  not  present  in  the  model.  The  model  user  must  be  able 
to  take  advantage  of  the  resulting  simplified  approach  to  the  physiological 
system,  but  must,  at  the  same  time,  be  careful  not  to  oversimplify  the  complex 
physical  interrelationships  to  the  point  at  which  the  results  are  physiologically 
meaningless. 

This presentation will utilize several case studies to demonstrate the use
of model development in designing experiments to study overall system function,
subsystem operation and compartment analysis, and parameter evaluation.

1. INTRODUCTION. Model development is a useful procedure in studying physiological
control systems, for it focuses attention on the cause-effect relationship at each
step of the control process, and integrates in a systematic way all that is presently
known about the particular system. Models can be presented in many different
modes,  some  of  which  might  be  scaled  versions  of  the  actual  system,  physical  analogs 
consisting  of  hardware  elements  or  alternative  living  systems,  and  both  analog  or 
digital  computer  simulations.  The  emphasis  in  this  presentation,  however,  will  be 
on  the  mathematical  descriptions  of  system  function  and  the  computer  simulations 
of  these  relationships.  In  particular,  the  application  of  models  in  research, 
teaching,  and  the  design  of  experiments  will  be  discussed  in  terms  of  specific 
examples  of  endocrine  and  respiratory  function. 

Early application of the control engineer's approach to physiological system
studies appeared in the work of Grodins(a) and Stark(b) in their studies of
respiratory function and pupillary motion, respectively (1,2). Grodins' first
model of respiratory function divided the body into two compartments, the lungs
and the remaining tissue. In addition, he assumed that control of respiration
was purely a function of carbon dioxide concentration at particular sites within
the circulation. Circulation time was also assumed to be negligible. Validation
studies were then performed on the model, at which time model results were
compared with known experimental results from a living system. Deviations
between the model and the living system suggested several additions to the
model, which Grodins incorporated in subsequent more complex representations. A
second model included circulation time as a non-negligible parameter, and added
the effect of alveolar dead space to the two-compartment study. This more
advanced model could be used to study both normal respiratory function and the
abnormal behavior associated with Cheyne-Stokes breathing(c). A third model
added the brain compartment to the original structure, and also included the
effect of oxygen concentration on respiratory control. The Grodins models
illustrate one approach to model building, which begins with a simple, but
non-trivial, model and adds complexity to make the model results agree with the
results of physiological experimentation.

Stark,  on  the  other  hand,  used  the  modeling  approach  in  designing  his 
experimental  protocol  to  study  pupillary  diameter  as  a  function  of  light  incident 
to  the  eye.  He  used  a  qualitative  description  of  the  system  to  develop  a  block 
diagram  representing  the  functional  portions  of  the  pupillary  control  mechanism. 
Available  data  could  then  be  used  to  describe  quantitatively  the  overall  closed 
loop  system,  but  it  could  not  be  used  to  develop  the  mathematical  relationships 
between  the  subsystem  variables  within  the  closed  loop.  Stark  then  designed  an 
experiment  which  would  produce  the  necessary  information  on  open  loop  response  in  an 
in  vivo,  physiologically  undisturbed  human  subject.  Incident  light  was  focused  at 
the  plane  of  the  iris  so  that  the  cross  section  of  light  entering  the  eye  was  less 
than the smallest pupil diameter. Incident light intensity and pupil response were
then recorded with an infrared electro-optical arrangement, from which frequency
response curves could be developed. Transfer functions for the open loop system
were then constructed and a mathematical description of the overall system was thus
determined. Stark thus used a modeling approach to describe the information flow
through the system, and to see how available data could be used to quantitatively
describe system function. When such descriptions could not be developed, the
structure and suggested cause-effect pathways within the model could be used to aid
in the design of an experiment which would produce the specific information necessary
for system quantification. Although this procedure was satisfactory in the case
of pupillary dynamics, it is not always possible to satisfy model requirements
within physiological constraints. However, the modeling approach does, as a minimum,
suggest guidelines for experimental design which would result in the necessary
input-output analytical relationships between system variables.

a. First published in 1954.

b. First published in 1959.

c. Cheyne-Stokes breathing: periodic increase and decrease in depth of breathing
(tidal volume).

2. APPLICATION OF MODELS. Models of physiological systems have been used in research,
teaching, and the design of experiments. There are two distinct steps involved in
applying the modeling approach to experimental design. In developing the model,
areas where the available data are not adequate to explain the operation of the
system will become clarified, and a study of the flow of information necessary to
completely implement the model will suggest tests and experimental procedures for
generation of additional data. Such an example was discussed previously in the
description of Stark's work. Then, once the model has been developed, it may offer
a desirable alternative to living system experiments, where preparation time may
be many hours, days, or months, and where surgical or chemical intervention may
cause undesirable side effects. Such experiments can be implemented on the model,
generally with little difficulty and little loss of time. The model can be used
to "zero in" on a best experimental protocol, saving the animal experimentation
for the final stages of exploration. Thus the model does not replace the need for
animal experiments to finally validate methods and conclusions, but simply serves as
a "short cut" to the final procedure, providing an easier, less expensive, and less
time consuming alternative in the overall investigation.

The  model  can  also  be  used  to  predict  the  effect  of  system  changes  and  system 
sensitivities to structural and component changes. Using the model, it is a
relatively simple matter to propose parameter alterations, and to observe the relative
significance of these changes on the operating characteristics of the total system, as
well as the sensitivity of the system to these changes. This is possible even for
variables  and  parameters  which  cannot  be  observed  directly  in  the  physiological 
environment.  This  capability  has  important  research  and  clinical  applications,  since 
it  can  provide  a  means  for  evaluating  the  probability  of  existence  of  various 
pathological  states  and  may  possibly  suggest  the  etiology  of  a  particular  disease. 

The  physiological  model  can  also  serve  as  an  effective  adjunct  in  the  training 
of bioengineers and medical scientists. The model can present problems in physiological
dynamics in terms of cause-and-effect relationships between functioning
parts  of  the  system  and  total  system  operation.  For  example,  it  can  be  used  to 
study  the  response  of  pathological  states  to  various  treatments.  One  important 
attribute of such a model is that a "patient" can be constructed with any desired
pathological condition, and the student can be exposed to this patient in much
the  same  way  as  he  would  explore  a  clinical  case.  Thus  the  student  can  investigate 
many  varieties  of  disease  states,  propose  and  validate  a  host  of  possible  treatment 
protocols,  and  develop  conceptual  information  about  pathological  dynamics,  all  in 
a  single  model  of  the  physiological  system  of  interest.  At  present,  however,  such 
computerized  models  of  physiological  system  dynamics  are  not  generally  available, 
but tutorial, inquiry-response and steady-state simulations are available and
finding growing acceptance in the educational community.

3. DEVELOPMENT OF MODELS. The development of a model can be broken down into
phases. These are block diagram formulation, data collection, mathematical description
of  the  data,  and  computer  simulation.  The  first  step  is  the  development  of  a  block 
diagram  based  on  the  known  physical  principles  of  the  system  operation.  This 
diagram  should  display  the  important  characteristics  of  the  system.  This  diagram 
may be too complex for initial simulation since it will probably include secondary
functions  which  are  not  critical  to  overall  performance.  In  addition,  the  diagram 
may  contain  physiological  variables  whose  quantitative  relationships  are  either  not 
available  in  the  literature  or  are  extremely  difficult,  if  not  impossible,  to 
determine  by  physiological  experimentation.  Therefore  a  revised  simplified  block 
diagram  must  be  developed.  This  is  generally  a  qualitative  description  of  system 
behavior,  and  at  this  point  quantitative  relationships  must  be  obtained. 

Physiological  experiments  must  now  be  performed  in  order  to  derive  dynamic 
input-output  relationships  for  each  block  of  the  model,  unless  these  data  are 
already  available  from  prior  work.  Static  characteristics  may  provide  useful 
information  for  model  development,  but  they  cannot  provide  the  information  necessary 
for  a  complete  description  of  system  behavior.  The  design  of  the  experiments  should 
consider  the  particular  subject  (e.g.,  human,  dog,  rat,  etc.),  observation  times 
based  on  system  response  times,  quality  and  availability  of  data  analysis  and 
processing  techniques  (e.g.,  chemical  assays),  effect  of  the  procedures  on  altering 
system  physiology  (e.g.,  surgical  and  chemical  intervention),  and  overall  cost 
of  the  procedure.  Thus  the  block  diagram  model  acts  as  a  guide  in  designing  the 
physiological  experiments. 

In  order  to  use  the  experimental  data,  a  mathematical  description  of  the 
data  must  be  obtained.  These  may  be  functions  of  time  when  considering  system 
dynamics.  If,  for  example,  the  blocks  of  the  model  are  assumed  to  represent  linear 
subsystems or linearized approximations to non-linear operation, the final
mathematical representation for each block will be a transfer function $T(s) = Y(s)/X(s)$,
where $Y(s)$ and $X(s)$ are the Laplace Transforms of the output and input,
respectively,  of  the  block.  The  time-domain  description  of  these  functions  may 
be  obtained  using  curve-fitting  techniques. 
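
As a worked illustration of the transfer-function description above, consider the simplest case of a block that behaves as a first-order lag. The first-order form, the gain $K$ and the time constant $\tau$ are assumptions introduced here for illustration; the paper does not state the form fitted to any particular block.

```latex
% Assumed first-order block (K and \tau are illustrative symbols, not from the paper)
T(s) = \frac{Y(s)}{X(s)} = \frac{K}{\tau s + 1}

% Unit-step input, X(s) = 1/s:
Y(s) = \frac{K}{s(\tau s + 1)} = K\left(\frac{1}{s} - \frac{1}{s + 1/\tau}\right)

% Time-domain response, obtained by inverse Laplace transformation:
y(t) = K\left(1 - e^{-t/\tau}\right), \qquad t \ge 0
```

Under this assumption, fitting the curve $K(1 - e^{-t/\tau})$ to a measured step response yields the two parameters of the block's transfer function.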

This  overall  mathematical  structure  can  be  simulated  on  an  analog  or  digital 
computer  as  an  aid  in  exploiting  the  model.  Once  a  simulation  is  developed  both 
normal  and  pathological  cases  can  be  investigated  by  changing  either  potentiometer 
settings (analog simulation) or data values (digital simulation). Both analog and
digital computers have advantages and disadvantages in their application. The
analog computer is the most direct form of simulation since the basic operations
such  as  integration  and  multiplication  are  carried  out  continuously  in  either  real 
time  or  a  directly  scaled  version  of  real  time.  The  disadvantages  of  this  form  of 
simulation  are  the  necessity  for  amplitude  and  time  scaling,  and  the  complexity  of 
the  wiring  or  patching  which  occurs  as  the  order  of  the  system  increases.  Digital 
computer implementation on either large-scale machines (e.g., IBM 370) or small-scale
minicomputers (e.g., DEC PDP-8) is another route for computer modeling. The
simulation  languages  available  for  use  on  these  machines  (CSMP,  MIDAS,  ISL/8)  provide 
a  direct  method  for  simulating  an  analog  computer  on  the  digital  computer  facility 
without  the  drawbacks  of  patching  wires  or  time  and  amplitude  scaling.  Disadvantages 
of large digital computer simulation are the general unavailability of on-line interactive
operation of the simulation languages and long turn-around times. Using a
minicomputer  can  avoid  these  difficulties,  but  limited  computer  availability  may  be 
a problem. However, as costs decrease and machine capability increases, minicomputers
are  becoming  more  widely  available  in  biomedical  research  and  education  facilities. 
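
The digital route described above can be sketched in a few lines. The following is a minimal illustration, not a reproduction of CSMP, MIDAS or ISL/8: it assumes a single first-order block, tau*dy/dt + y = gain*u(t), and integrates it with a fixed-step Euler rule, which plays the role of the integrator on an analog patch panel. The function names and the parameter values are hypothetical.

```python
import math


def simulate_first_order(gain, tau, u, dt, t_end):
    """Euler-integrate tau*dy/dt + y = gain*u(t) with y(0) = 0.

    Returns (times, outputs) sampled every dt up to t_end.
    """
    n_steps = int(round(t_end / dt))
    times, outputs = [0.0], [0.0]
    y = 0.0
    for k in range(n_steps):
        t = k * dt
        dydt = (gain * u(t) - y) / tau  # signal fed to the integrator
        y += dt * dydt                  # one Euler integration step
        times.append((k + 1) * dt)
        outputs.append(y)
    return times, outputs


def unit_step(t):
    """Step input applied at t = 0."""
    return 1.0


# Hypothetical block parameters, chosen only for illustration.
times, outputs = simulate_first_order(gain=2.0, tau=1.5, u=unit_step,
                                      dt=0.001, t_end=6.0)
exact_final = 2.0 * (1.0 - math.exp(-6.0 / 1.5))  # analytic step response
```

Changing `gain` or `tau` here corresponds to changing a data value in the digital simulation, as opposed to turning a potentiometer on the analog machine. No amplitude or time scaling is needed, at the cost of choosing a step size `dt` small enough for the fastest dynamics.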

4. CASE STUDIES. Three case studies will be presented to demonstrate the use of
model development in designing experiments to study overall system function, subsystem
operation, and parameter evaluation. In particular, the glucose-insulin and
testosterone endocrine systems, and the respiratory system will be discussed.

4A. GLUCOSE-INSULIN HOMEOSTASIS. The development of the glucose-insulin model
demonstrates the use of modeling in the design of experiments in a situation similar
to that of Stark's approach to pupillary dynamics (3,4). The glucose homeostatic
system  consists  of  a  complex  interaction  between  subsystems  regulating  hormonal 
release,  glucose  storage,  and  glucose  utilization.  Each  such  perfusion  region  can 
be  viewed  as  a  combination  of  controller  and  plant  working  together  to  control 
glucose  and  insulin  levels.  The  pancreas  and  liver  may  be  considered  primary 
controllers  due  to  their  function  under  both  hypoglycemic  and  hyperglycemic 
conditions,  while  plant  function  is  represented  by  peripheral  tissue  activity. 

A  block  diagram  of  the  primary  interacting  mechanisms  of  glucose-insulin  control 
is  presented  in  Fig.  1. 

Although  a  quantitative  description  of  total  system  function  can  be  obtained 
from  overall  input-output  measurements  (e.g.,  system  plasma  responses),  a  clear 
understanding  of  individual  subsystem  function  and  interaction  within  the  intact 
closed loop system can only be obtained if each block is itself described quantitatively.
The modeling approach emphasizes this fundamental observation, and focuses
one’s  attention  on  those  experimental  procedures  which  will  yield  the  input-output 
data  necessary  for  subsystem  development  in  a  dynamic  sense.  Total  system  response 
data  is  widely  available  in  the  literature.  For  example,  fundamental  glucose 
tolerance  test  results  can  be  used  to  relate  system  glucose  response  to  glucose 
input  over  the  time  base  of  the  test.  However,  the  data  needed  to  describe  each 
physiological  block  in  the  figure  is  not  generally  available.  A  study  of  the  model 
led  to  the  development  of  an  experimental  protocol  which  satisfied  both  modeling 
requirements  and  physiological  constraints  involved  in  monitoring  system  variables 
for glucose-insulin control. Simultaneous input and output plasma concentrations
for  glucose  and  insulin  were  obtained  for  the  liver,  pancreas,  and  periphery  over 
a  fixed  time  sequence  following  glucose  and  insulin  stimulus,  respectively.  These 
data were used to derive mathematical functions describing input and output dynamics
for each block of the closed loop. A set of normoglycemic glucose and insulin
concentration curves in response to a glucose load is shown in Fig. 2. The
impulse-like glucose load drives the total system into a temporary hyperglycemic
condition, which elicits a pancreatic insulin response. These experimental
results  indicate  an  overreacting  pancreatic  insulin  output,  which  is  mediated  by 
hepatic  insulin  clearance.  Glucose  levels  rose  very  rapidly  throughout  the  system, 
but  began  to  decrease  as  insulin  levels  increased.  Glucose  concentrations  returned 
to  normal  resting  levels  in  a  decaying  oscillatory  pattern,  as  would  be  expected  of 
an  underdamped  higher-order  system. 
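
The decaying oscillatory return to baseline can be illustrated with the simplest underdamped case, a second-order system. This is a sketch under stated assumptions: the paper speaks only of an "underdamped higher-order system", and the damping ratio `zeta` and natural frequency `omega_n` below are hypothetical values, not parameters estimated from the glucose data.

```python
import math


def impulse_response(t, zeta, omega_n):
    """Unit-impulse response of omega_n^2 / (s^2 + 2*zeta*omega_n*s + omega_n^2).

    Valid for the underdamped case 0 < zeta < 1.
    """
    if not 0.0 < zeta < 1.0:
        raise ValueError("underdamped case requires 0 < zeta < 1")
    omega_d = omega_n * math.sqrt(1.0 - zeta ** 2)  # damped frequency
    envelope = math.exp(-zeta * omega_n * t)        # exponential decay
    return (omega_n ** 2 / omega_d) * envelope * math.sin(omega_d * t)


def decay_ratio_per_cycle(zeta):
    """Factor by which the oscillation amplitude shrinks over one full cycle."""
    return math.exp(-2.0 * math.pi * zeta / math.sqrt(1.0 - zeta ** 2))


# Hypothetical parameters (time in minutes), for illustration only.
zeta, omega_n = 0.3, 0.1
period = 2.0 * math.pi / (omega_n * math.sqrt(1.0 - zeta ** 2))
first = impulse_response(5.0, zeta, omega_n)
one_cycle_later = impulse_response(5.0 + period, zeta, omega_n)
```

The ratio `one_cycle_later / first` equals `decay_ratio_per_cycle(zeta)`, so the amplitude ratio of successive swings in a measured curve gives a rough estimate of the damping, which is one way the qualitative description "lightly damped" could be made quantitative.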

The curves of Figs. 3 and 4 describe arterial and hepatic concentration of
glucose and insulin following insulin loading. The additional parameter of elapsed
time after surgery is also included in these figures. The early post-operative
(EPO; 2 hours after surgery) response is more sensitive and less stable than the late
post-operative (LPO; between 2 and 14 days after surgery) response. Arterial glucose
levels decrease almost 70% from resting levels and return more slowly in the EPO
than  the  LPO  cases.  Similarly,  hepatic  settling  time  is  much  greater  in  the  EPO 
case.  It  is  also  initially  highly  oscillatory,  perhaps  indicating  a  very  sensitive, 
lightly  damped  system.  Such  differences  between  the  EPO  and  LPO  cases  suggest  a 
possible  test  for  degree  of  recovery  after  surgery. 

Thus,  the  modeling  procedures  have  been  used  as  a  guide  in  the  design  of  an 
experimental protocol which was used to obtain the data necessary for determining
true in-vivo relationships between subsystem variables. In addition, these subsystem
studies have indicated the possibility of developing additional diagnostic
criteria  based  on  dynamic  glucose  subsystem  response. 

4B. TESTOSTERONE  DYNAMICS.  As  another  example  of  modeling  of  physiologic  systems, 
the  testosterone  system  is  considered  (5,6).  Testosterone,  the  male  sex  hormone, 
gives  the  male  his  secondary  sexual  characteristics  such  as  hair  distribution, 
skin  texture  and  voice  quality.  Fig.  5  represents  a  complete  block  diagram  for 
the  testosterone  control  system.  Testosterone  is  secreted  by  the  gonads  and  adrenal 
cortex and is produced peripherally through conversion of precursors.
Hypothalamus-pituitary activity provides the primary control of testosterone secretion
through the action of releasing factors and the hormones FSH, LH and ACTH. In conjunction
with  this,  testosterone  removal  mechanisms  such  as  tissue  storage  and  metabolism 
determine  blood  testosterone  concentration. 

This  block  diagram  contains  several  effects  which  can  be  considered  "second 
order".  These  include  FSH  control,  testosterone  secretion  and  the  "short  feedback" 
pathway  in  which  the  hypothalamus  secretion  of  releasing  factors  is  controlled  by 
the  blood  FSH  and  LH  concentrations.  As  described  earlier,  this  total  qualitative 
model  is  considered  too  complex  for  use  in  the  initial  modeling  effort.  A  simplified 
block diagram, shown in Fig. 6, was developed in which second order effects were
eliminated.

Note: FSH = Follicle-Stimulating Hormone; LH = Luteinizing Hormone;
ACTH = Adrenocorticotrophic Hormone.

As in the glucose-insulin case, an experimental protocol was developed to
obtain mathematical descriptions of each block of the figure. As an example, to
mathematically describe the testosterone disappearance block, an experiment was
designed in which radioactively labelled testosterone was rapidly injected
intravenously  into  a  rat  and  blood  samples  were  obtained  at  specific  times 
following  injection.  These  blood  samples  were  analyzed  for  radioactivity  and 
the  resulting  data  is  shown  in  Fig.  7.  Since  the  experimental  procedure  limits 
all  input  excitations  to  small  perturbations  about  normal  circulatory  steady  state 
levels,  the  model  can  be  considered  to  be  linear.  Thus,  the  curve  of  Fig.  7,  which 
is the "step response" of the testosterone disappearance block, can be used to
generate  a  transfer  function  for  this  subsystem.  The  analog  simulation  of  this 
transfer  function  is  shown  in  Fig.  8.  Similar  procedures  lead  to  transfer 
functions  and  simulations  for  the  other  blocks  of  the  model. 
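
One way a disappearance curve of this kind could be turned into block parameters is sketched below. It assumes the radioactivity decays as a single exponential, y(t) = A*exp(-t/tau), and estimates A and tau by least squares on the logarithm of the data. The single-exponential form, the function name and the sample values are all assumptions for illustration; the paper does not give the fitted form or the data of Fig. 7.

```python
import math


def fit_exponential_decay(times, values):
    """Fit ln(y) = ln(A) - t/tau by ordinary least squares; return (A, tau)."""
    n = len(times)
    logs = [math.log(v) for v in values]
    mean_t = sum(times) / n
    mean_log = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (lg - mean_log) for t, lg in zip(times, logs))
    slope = sxy / sxx                       # equals -1/tau
    intercept = mean_log - slope * mean_t   # equals ln(A)
    return math.exp(intercept), -1.0 / slope


# Hypothetical, noise-free samples (minutes after injection, counts).
true_a, true_tau = 100.0, 12.0
sample_times = [2.0, 5.0, 10.0, 20.0, 30.0, 45.0]
sample_counts = [true_a * math.exp(-t / true_tau) for t in sample_times]

a_hat, tau_hat = fit_exponential_decay(sample_times, sample_counts)
```

With real assay data the points would scatter about the fitted line, and a visibly curved log plot would indicate that more than one exponential term (a higher-order block) is needed.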

Once  a  working  simulation  is  developed,  experiments  are  performed  on  the 
model  to  validate  its  performance  characteristics  and  to  improve  knowledge  of 
system behavior. This additional information can be used to create a more refined
model.  If  little  quantitative  information  is  available,  experiments  on  the  model 
may  suggest  physiological  experiments  to  be  performed  to  obtain  such  information. 

The  open  loop  response  of  each  block  of  the  testosterone  model  compared  favorably 
with experimental results. Closed loop tests were then performed on the model.

As  an  example,  consider  exciting  the  model  with  a  step  of  voltage  at  the  input 
of  the  testosterone  disappearance  block.  This  corresponds  physiologically  to 
a rapid intravenous injection of testosterone at time t = 0. Responses are observed
at  the  outputs  of  the  LH  disappearance  and  testosterone  disappearance  blocks, 
corresponding  physiologically  to  the  blood  LH  and  testosterone  concentrations, 
respectively.  The  results  are  shown  in  Fig.  9,  which  displays  the  deviations  from 
baseline  of  these  curves.  As  can  be  seen,  the  blood  testosterone  level  begins  at 
the  injected  level  and  returns  to  baseline  with  some  oscillation  within  24  hours 
after injection. The blood LH concentration begins below baseline in order to
compensate for the increased testosterone level. The LH concentration then returns
to baseline, again with a slight oscillation, within 24 hours after
injection. 

These  results  are  as  expected  using  a  qualitative  knowledge  of  system  behavior, 
but  there  are  no  quantitative  physiological  data  available  with  which  to  check 
the  results.  It  is  therefore  necessary  to  perform  physiological  experiments 
to  generate  such  quantitative  data. 

4C. RESPIRATORY FUNCTION. A digital computer simulation of respiratory function
has been developed, based on the block diagram representation of Fig. 10 (7,8,9).
This diagram, unlike that of the original Grodins model, includes all that is
known about respiratory function and control, at least in a qualitative sense.
Once the overall system is developed, each subsystem must be described individually,
and the appropriate interaction must be included so that the response of the combined
subsystems to a simulated physiological input such as intrapleural pressure
closely resembles that of the living system. Just as in the glucose-insulin study,
the overall complex model was initially developed qualitatively, and then each subsystem
was studied individually and described mathematically. Unlike the glucose
case, an experimental protocol was not necessary, since each block was described
from the basic physics of the system function, and the specific parameter values were

Fig. 1. A model for glucose-insulin homeostasis.

Fig. 2. Glucose and insulin concentration dynamics for a normo-

Fig. 4. Hepatic glucose response to insulin loading for early and late
post-operative studies.

Fig. 5. Major control pathways of testosterone concentration.

Fig. 9. Responses of the control system to a rapid intravenous injection of
testosterone. Panels: output of testosterone disappearance block due to step
input of testosterone (1 volt = 0.01 ug testosterone); output of LH
disappearance block due to step input of testosterone (1 volt = 8.67 x 10^-[exponent
illegible] ng NIH-LH-S11/ml blood). One second of computer time is equivalent to
one hour of system time.

Fig. 10. A block diagram of the human respiratory system.

Simple model. Plotted variables: A, driving pressure (cm H2O); B, airflow
(liters/sec); C, alveolar volume (liters); D, alveolar pressure (cm H2O).

ISL output plot for example run.

A  DESIGN  FOR  THE  DETECTION  OF  SYNERGY  IN  DRUG  MIXTURES


P.  V.  Piserchia 
B.  V.  Shah 

Research  Triangle  Institute 
Post  Office  Box  12194 
Research  Triangle  Park,  North  Carolina 


ABSTRACT. In Biometrics [September, 1969], P. S. Hewlett gives
a definition of synergy based on the curvature of isobars of drug
mixtures. Specifically, if $X(\theta)$ and $Y(\theta)$ represent doses for two drugs
A and B which correspond to an ED($\theta$) response level (i.e., a proportion
$\theta$ of all individuals tested will show the specified response) and if
$(\lambda X(\theta), (1-\lambda)Y(\theta))$ represents a dose of a mixture consisting of a
proportion $\lambda$ of $X(\theta)$ and $(1-\lambda)$ of $Y(\theta)$, then synergy is absent or present
according to whether the proportion $P(\lambda)$ of individuals responding to
the dose $(\lambda X(\theta), (1-\lambda)Y(\theta))$ equals or exceeds $\theta$ for various values of
$\lambda$; that is,

$P(\lambda) > \theta$ for some $\lambda$ implies synergism.

An immediate consequence of this definition which we prove is:

Suppose $X_0$ and $Y_0$ are two doses (not necessarily equivalent)
of A and B. Consider the straight line connecting $X_0$ and $Y_0$
and written as $X = \lambda X_0$, $Y = (1-\lambda) Y_0$, $0 \le \lambda \le 1$. Then, if
there exists a $\lambda_0$ such that

$P(\lambda_0) = P(\lambda_0 X_0, (1-\lambda_0) Y_0) > \max\{P(X_0,0), P(0,Y_0)\}$

then there exists a nonlinear isobar and, hence, synergy is
shown to occur.

The import of the above derives from the fact that a test for
synergy in drugs may be performed with as few as three test groups
(those receiving $X_0$ alone, those receiving $Y_0$ alone and those receiving
$(\lambda_0 X_0, (1-\lambda_0) Y_0)$) and, perhaps more important, the doses $X_0$ and $Y_0$
need not be equivalent.

1.  INTRODUCTION  AND  DEFINITION  OF  SYNERGY.  In  this  paper,  we 
shall  consider  the  effects  of  two  drugs,  combined  in  various  mixtures, 
on  the  responses  of  some  biological  system  or  organism.  The  principal 
question  of  interest  is  whether  the  phenomenon  of  synergism  occurs. 
Following  Bushby  [1969],  we  say  synergy  between  two  drugs  occurs  when, 
acting together, they evoke the same response as when they act singly,
but at lower concentrations, or their effects interact in a fashion
which is to the advantage of the organism by producing an otherwise
unattainable rise in biological activity.

Each of the above concepts is related to the nature of some mechanism
of joint drug action. A substantial amount of effort has been
devoted  to  the  construction  of  mathematical  and  statistical  models  for 
joint  drug  action  (see  Plackett  and  Hewlett  [1967]  and  Ashford  and  Smith 
[1965]  for  a  suitable  list  of  references).  However,  certain  aspects 
of this research appear to be controversial and no comprehensive and
overall acceptable model exists. One reason for this is the
complex manner in which the effects of drug mixtures are manifested.

To  use  the  terminology  of  Hewlett  and  Plackett  [1959]  and  Plackett  and 
Hewlett  [1967],  the  joint  action  of  two  drugs  may  be  similar  or  dis¬ 
similar  according  to  whether  the  primary  sites  of  action  for  the  two 
drugs  are  the  same  or  different.  Alternatively,  the  joint  action  may 
be  non-interactive  or  interactive  if  one  drug  has  either  no  influence 
or  some  influence  on  the  biological  activity  of  the  other. 

These  distinctions  have  given  rise  to  four  situations  as  described 
in  the  following  table: 


                    Similar             Dissimilar
Non-Interactive     Simple Similar      Independent
Interactive         Complex Similar     Dependent

Plackett  and  Hewlett  [1967]  further  indicate  that  one  criticism  of 
the above classification is that the "action of two drugs, whether interactive
or not, may in some sense be partially similar; similar and
dissimilar actions should be regarded as at opposite ends of continuum
of biological possibilities." Within this context, the concept of synergism
is primarily related to whether the effect of drug mixtures is
non-interactive or interactive regardless of its position along the continuum
from similar to dissimilar. However, part of the controversy
associated with this topic pertains to the equating of no synergism to
only the simple similar situation. Hence, although there do exist a
number  of  methods  for  fitting  joint  action  models,  an  alternative 
approach  to  the  concept  of  synergy  which  is  widely  acceptable  to  most 
research  workers  is  required. 

As a result, Hewlett [1969] has discussed the measurement of the
potencies of drug mixtures in terms of isobars, a procedure used in
pharmacology. To construct an isobar for two drugs, the doses of the
drugs are measured respectively on actual physical scales (e.g., mg/cc)
along the two axes and hypothetical points representing the dose pairs
producing a fixed biological response are plotted (e.g., 50% of the individuals
receiving such a drug mixture dose evoke some specified quantal
response). Of course, in an actual situation these points would have
to be determined experimentally; but, to elucidate the concept we shall
presume that the desired set of points is already known. An example
is shown in the figure below where the fixed points on the two axes
correspond to the doses for the two drugs separately which lead to a
50% response rate among the tested individuals.


Figure 1. Hypothesized isobar for two synergistic drugs. (Axes: dose of
Drug A and dose of Drug B, each marked at its ED(50).)

The curve in Figure 1 is called an isobar. If it is a straight
line, then one says that the two drugs show "additive action." On the
other hand, if it falls below the straight line connecting the two
fixed points, then one says that synergism (or potentiation) occurs.
This definition tends to bypass the question of similarity or dissimilarity
of the joint drug action but yet is consistent with lower concentrations
evoking the same response which Bushby [1969] uses in describing synergy.
Hence, throughout the remainder of this paper, synergy will be viewed
as curvature of isobars, giving rise to the following formal definition
of synergy.

Let $P(X,Y)$ denote the proportion of individuals responding to a
mixture of drugs A and B, where X = X units of A and Y = Y units of B.

Assume that $P(X,Y)$ obeys the following:

(a) $0 \le P(X,Y) \le 1$ for $X \ge 0$, $Y \ge 0$,

(b) $P(X,0)$ and $P(0,Y)$ are continuous and monotonically
nondecreasing functions of X and Y, respectively.

If for a specific $\theta$ there exists an X or Y such that $P(X,0) = \theta$
or $P(0,Y) = \theta$, denote X as $X(\theta)$ and Y as $Y(\theta)$.

Now, suppose there exists a combination of A and B denoted as
$(X^*,Y^*)$ with $P(X^*,Y^*) = \theta^*$ (say), then the combination $(X^*,Y^*)$ is
said to be synergistic if one of the following conditions holds:

Condition 1: If neither $X(\theta^*)$ nor $Y(\theta^*)$ exist then $(X^*,Y^*)$ is synergistic
if $\theta^* > P(X,0)$ for all X and $\theta^* > P(0,Y)$ for all Y.

Condition 2: If either $X(\theta^*)$ or $Y(\theta^*)$, but not both, exist then $(X^*,Y^*)$
is synergistic if $X^* < X(\theta^*)$ and $\theta^* > P(0,Y)$ for all Y, or, $Y^* < Y(\theta^*)$ and
$\theta^* > P(X,0)$ for all X.

Condition 3: If $X(\theta^*)$ and $Y(\theta^*)$ both exist then $(X^*,Y^*)$ is synergistic
if

$\frac{X^*}{X(\theta^*)} + \frac{Y^*}{Y(\theta^*)} < 1$.

Briefly, condition (1) maintains that $(X^*,Y^*)$ is synergistic if
an otherwise unattainable rise in biological activity is achieved
[Bushby, 1969]. Conditions (2), (3) are, formally, Hewlett's [1969]
conditions for synergy.
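
The three conditions can be written as a small decision function. This is a minimal sketch, not part of the paper: the function name and arguments are hypothetical, `None` stands for an equivalent single-drug dose that does not exist, and the two boolean flags encode whether the mixture response exceeds every response attainable with drug A alone or drug B alone, which must be judged from the single-drug dose-response curves.

```python
from typing import Optional


def is_synergistic(
    x_star: float,
    y_star: float,
    x_theta: Optional[float],
    y_theta: Optional[float],
    exceeds_all_a: bool = False,
    exceeds_all_b: bool = False,
) -> bool:
    """Apply Conditions 1-3 to the combination (x_star, y_star).

    x_theta, y_theta: single-drug doses giving the same response as the
    mixture, or None when no such dose exists (assumed positive otherwise).
    exceeds_all_a / exceeds_all_b: True when the mixture response exceeds
    P(X,0) for all X, respectively P(0,Y) for all Y.
    """
    if x_theta is None and y_theta is None:
        # Condition 1: an otherwise unattainable response.
        return exceeds_all_a and exceeds_all_b
    if y_theta is None:
        # Condition 2, case where only X(theta*) exists.
        return x_star < x_theta and exceeds_all_b
    if x_theta is None:
        # Condition 2, case where only Y(theta*) exists.
        return y_star < y_theta and exceeds_all_a
    # Condition 3: the point lies below the straight-line isobar.
    return x_star / x_theta + y_star / y_theta < 1.0
```

For example, with equivalent single-drug doses of 4 units each, the mixture (1, 1) gives 1/4 + 1/4 < 1 and is classed as synergistic, while (2, 2) lies exactly on the straight-line isobar and is not.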

2.  IMPLICATIONS  OF  THE  DEFINITION.  An  immediate  consequence  of 
the  above  definition  is  the  following  theorem  and  proof. 


Theorem: Suppose $X_0$ and $Y_0$ are two doses (not necessarily equivalent)
of drugs A and B. Consider the straight line joining $(X_0,0)$ and
$(0,Y_0)$ and written as $X = \lambda X_0$, $Y = (1-\lambda)Y_0$, $0 \le \lambda \le 1$. Then, if there
exists a $\lambda_0$ such that:

$\theta_0 = P(\lambda_0 X_0, (1-\lambda_0)Y_0) > \max\{P(X_0,0), P(0,Y_0)\}$,

then $(\lambda_0 X_0, (1-\lambda_0)Y_0)$ is a synergistic combination of A and B.

Proof.

Case 1: Suppose neither $X(\theta_0)$ nor $Y(\theta_0)$ exist. Then, by the continuity
assumption, $\theta_0 > P(X,0)$ for all X, and, $\theta_0 > P(0,Y)$ for all Y.
Hence, $(\lambda_0 X_0, (1-\lambda_0)Y_0)$ is synergistic by Condition 1.

Case 2: Without loss of generality assume $X(\theta_0)$ exists and $Y(\theta_0)$ does
not. Then again, by the continuity assumption,

$\theta_0 > P(0,Y)$ for all Y.

Also, $P(X(\theta_0), 0) = \theta_0 > P(X_0,0)$, by assumption, and, through
monotonicity, $X(\theta_0) > X_0$.
Therefore, $X(\theta_0) > X_0 \ge \lambda_0 X_0$ and $(\lambda_0 X_0, (1-\lambda_0)Y_0)$ is synergistic by
Condition 2.

Case 3: If $X(\theta_0)$ and $Y(\theta_0)$ both exist then, $\theta_0 = P(X(\theta_0), 0) > P(X_0,0)$
and, $\theta_0 = P(0,Y(\theta_0)) > P(0,Y_0)$.

Hence, by the monotonicity assumption we have:

$X(\theta_0) > X_0$ and $Y(\theta_0) > Y_0$.

Therefore,

$\lambda_0 X(\theta_0) \ge \lambda_0 X_0$ and $(1-\lambda_0)Y(\theta_0) \ge (1-\lambda_0)Y_0$,

and,

$\lambda_0 \ge \frac{\lambda_0 X_0}{X(\theta_0)}$ and $(1-\lambda_0) \ge \frac{(1-\lambda_0) Y_0}{Y(\theta_0)}$,

where at least one inequality is strict, since $\lambda_0$ and $(1-\lambda_0)$ cannot both
be zero.

Therefore,

$\frac{\lambda_0 X_0}{X(\theta_0)} + \frac{(1-\lambda_0) Y_0}{Y(\theta_0)} < 1$

and $(\lambda_0 X_0, (1-\lambda_0)Y_0)$ is synergistic by Condition 3.

Graphically,  the  above  theorem  is  represented  in  Figures  2  and  3. 


Figure 2. Isobar of a synergistic response, $P(X,Y)$.

Figure 3. Synergistic response $P(\lambda)$ as a function of $\lambda$, $(X_0, Y_0)$ fixed.

Notice the above does not require $X_0$ and $Y_0$ to be equivalent doses;
however, it does require that $\max_\lambda P(\lambda)$ be greater than both end points.
It is not sufficient to show $P(\lambda) > \lambda P(1) + (1-\lambda) P(0)$. An example should
suffice.

Consider the response defined by

$P(X,Y) = \log_e (X+Y+1)$ for $X + Y \le e - 1$,

$P(X,Y) = 1$ for $X + Y > e - 1$,

then, the isobars of $P(X,Y)$ are the lines $X + Y$ = const. These are clearly
straight line isobars and, by definition, an additive mixture. However,
consider the response along any line of the form $X = \lambda X_0$, $Y = (1-\lambda) Y_0$
where $X_0 > Y_0$. We have,

$P(\lambda) = P(\lambda X_0, (1-\lambda) Y_0) = \log_e(\lambda X_0 + (1-\lambda) Y_0 + 1)$

$= \log_e(\lambda(X_0 - Y_0) + Y_0 + 1)$.

Certainly, $P(\lambda) > \lambda P(1) + (1-\lambda) P(0)$ for every $0 < \lambda < 1$, but yet,
by definition, the mixtures are additive.
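
The counterexample can be checked numerically. The sketch below evaluates the logarithmic response along the line joining two single-drug doses and confirms both halves of the argument: the mixture response lies above the chord between the end points, yet never exceeds the larger end point. The doses `x0` and `y0` are hypothetical values chosen so that X + Y stays below e - 1.

```python
import math


def response(x, y):
    """P(X,Y) = ln(X + Y + 1), capped at 1 once X + Y exceeds e - 1."""
    total = x + y
    return math.log(total + 1.0) if total <= math.e - 1.0 else 1.0


def along_line(lam, x0, y0):
    """Response to the mixture (lam*x0, (1 - lam)*y0)."""
    return response(lam * x0, (1.0 - lam) * y0)


x0, y0 = 1.0, 0.5  # hypothetical doses with x0 > y0
grid = [i / 100 for i in range(1, 100)]  # interior values of lambda

p_end_a = along_line(1.0, x0, y0)  # drug A alone
p_end_b = along_line(0.0, x0, y0)  # drug B alone

# Concavity: every mixture response lies above the chord.
above_chord = all(
    along_line(lam, x0, y0) > lam * p_end_a + (1.0 - lam) * p_end_b
    for lam in grid
)

# The theorem's criterion: does any mixture beat both end points?
exceeds_endpoints = any(
    along_line(lam, x0, y0) > max(p_end_a, p_end_b) for lam in grid
)
```

Here `above_chord` is true while `exceeds_endpoints` is false, which is why the three-point criterion compares the mixture with the larger end point rather than with the interpolated value.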

Figure 4 gives the geometry of the situation.

Figure 4. A non-linear, additive drug mixture.

3. OPTIMAL MIXING. Associated with but not equivalent to synergy
is the concept of the optimal mixing of two drugs.

We say two drugs have an optimal mixing rate if there is a ridge
in the response, $P(X,Y)$, in a straight line direction. If the projection
of the ridge onto the $(X,Y)$ plane is a line $Y = \rho X$ then we say X and Y
have an optimal mixing rate $\rho = Y/X$.

The concept of optimal mixing is useful in establishing synergy.
Suppose an optimal mixing rate exists. Then, if $X_0$ and $Y_0$ are any two
doses of X and Y, we have that $\max_\lambda P(\lambda) = \max_\lambda P(\lambda X_0, (1-\lambda) Y_0)$ occurs at the
intersection of the two lines:

(1) $X = \lambda X_0$, $Y = (1-\lambda) Y_0$,

(2) $Y = \rho X$.

Solving for $\lambda$, we obtain

$\lambda = Y_0/(\rho X_0 + Y_0)$,

or equivalently,

$X = X_0 Y_0/(\rho X_0 + Y_0)$,

$Y = \rho X_0 Y_0/(\rho X_0 + Y_0)$.
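
The intersection formula above is easy to evaluate directly. The following sketch computes the mixing fraction and the resulting combination dose for chosen single-drug doses and a given mixing rate. The function names are hypothetical, and the mixing rate `rho` is assumed to be known, which in practice it would not be without prior experimentation.

```python
def optimal_lambda(x0, y0, rho):
    """Fraction lambda at which the line (lam*x0, (1-lam)*y0) meets Y = rho*X."""
    return y0 / (rho * x0 + y0)


def optimal_combination(x0, y0, rho):
    """Doses (X, Y) of the combination lying on the line of optimal mixing."""
    lam = optimal_lambda(x0, y0, rho)
    return lam * x0, (1.0 - lam) * y0


# Hypothetical example: 1:1 optimal mixing with unequal single-drug doses.
lam = optimal_lambda(x0=3.0, y0=1.0, rho=1.0)
x_mix, y_mix = optimal_combination(x0=3.0, y0=1.0, rho=1.0)
```

With these values the fraction is 1/4 rather than 1/2, and the combination contains 0.75 units of each drug, so the mixture sits on the line Y = X even though the two end-point doses differ.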

It is to be noticed that optimal mixing is defined in terms of the
parameter $\rho$ and not in terms of $\lambda$. We mention this so as to avoid confusion
in picking combinations of doses which are not on the line of
optimal mixing. For instance, suppose optimal mixing occurs in a 1:1
ratio. Then, $\rho = 1$ and the line of optimal mixing is $Y = \rho X = X$.
Now, suppose we choose doses $X_0$, $Y_0$ where $X_0 > Y_0$. Then, as Figure 5
illustrates, the maximum of $P(X,Y)$ along $X = \lambda X_0$, $Y = (1-\lambda) Y_0$ occurs at the
intersection of $X = \lambda X_0$, $Y = (1-\lambda) Y_0$ and $Y = X$. It does not occur
when $\lambda = 1/2$. Keeping this in mind, selection of combination doses
becomes a more rational procedure.

Figure 5. Representation of a three point design.

4. DESIGN AND ANALYSIS. Having defined synergy, we now proceed
to give certain methods useful in showing synergism if it exists.

The simplest design is the three point design. For a three point
design, one chooses doses $X_0$ of A and $Y_0$ of B and a combination
$(\lambda X_0, (1-\lambda) Y_0)$ of A and B. Synergism is then said to exist if one can
show

$P(\lambda) = P(\lambda X_0, (1-\lambda) Y_0) > \max\{P(X_0,0), P(0,Y_0)\}$.

We propose to do this by testing:

$H_0$: $P(\lambda) \le \max(P(X_0,0), P(0,Y_0))$

against the alternative:

$H_1$: $P(\lambda) > \max(P(X_0,0), P(0,Y_0))$.

The  test  statistics  used  will  be  the  simple  large  sample  normal 
test  for  differences  between  two  binomial  proportions.  However,  the 


where $P(X_0,0)$, $P(0,Y_0)$ and $P(\lambda)$ are the observed proportions of
individuals responding at doses $X_0$ and $Y_0$ and combination
$(\lambda X_0, (1-\lambda) Y_0)$, respectively, with $Q(X_0,0)$, $Q(0,Y_0)$ and $Q(\lambda)$
being the respective proportions not responding. Letting $\alpha = .05$ we
obtain [critical value illegible in source]. Letting $\alpha = .01$, we have a
critical value of 1.285.

Notice that in the above, no assumption is made about the equivalence
of $X_0$ and $Y_0$. This is not assumed because it is not necessary to choose
equivalent doses to establish synergy. Also, no assumption is made about
$\lambda$. Again this is done because no assumption concerning $\lambda$ (other than
$0 \le \lambda < 1$) is necessary. However, intuitively, the efficiency of the
test procedure should be greatest when $P(\lambda)$ is maximum. Therefore $\lambda$
should be chosen such that the combination lies on the intersection of the
line connecting $X_0$ and $Y_0$ and the line of optimal mixing as given in
section 3 of this paper.
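A minimal sketch of the kind of large-sample comparison described above. It assumes the usual unpooled normal statistic for the difference of two binomial proportions and assumes that synergy is declared only when the combination beats both single doses; the function names and the critical value passed in are illustrative assumptions, not the paper's exact procedure.

```python
from math import sqrt


def z_two_proportions(p_comb, n_comb, p_single, n_single):
    """Unpooled large-sample z statistic for (p_comb - p_single)."""
    var = (p_comb * (1.0 - p_comb) / n_comb
           + p_single * (1.0 - p_single) / n_single)
    if var <= 0.0:
        raise ValueError("variance estimate is zero; proportions at 0 or 1")
    return (p_comb - p_single) / sqrt(var)


def shows_synergy(p_comb, n_comb, p_x, n_x, p_y, n_y, z_crit):
    """True if the combination exceeds BOTH single doses at the given
    one-sided critical value (z_crit is supplied by the user)."""
    z_x = z_two_proportions(p_comb, n_comb, p_x, n_x)
    z_y = z_two_proportions(p_comb, n_comb, p_y, n_y)
    return min(z_x, z_y) > z_crit
```

For example, `shows_synergy(0.70, 56, 0.50, 44, 0.50, 44, z_crit=1.645)` compares an observed 70 percent response on the combination with 50 percent on each single dose.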

Tables I-IV present minimum sample sizes needed to detect synergy
for various values of $P_X = P(X_0,0) = P(0,Y_0) = P_Y$ and
$P(\lambda) = P_\lambda > P_X$. The four tables give required sample sizes for
significance levels .05 and .01 and power .80 and .90.

If we define $Z_{1-\alpha}$ and $Z_{1-\beta}$ as the $(1-\alpha)$-th and $(1-\beta)$-th percentage
points of the normal (0,1) distribution respectively and if we let
$\sigma_X = \sqrt{P_X(1-P_X)}$ and $\sigma_\lambda = \sqrt{P_\lambda(1-P_\lambda)}$ then the formula for
determining N, the total sample size, is given by:

[formula for N illegible in source]

where $\alpha$ is the significance level of the test and $(1-\beta)$ is the power
of the test.

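To determine $N_\lambda$, $N_X$ and $N_Y$ for a given N, allocation is carried out
by:

$$N_\lambda = N\sigma_\lambda/(\sqrt{2}\,\sigma_X + \sigma_\lambda)$$

and

$$N_X = N_Y = \tfrac{1}{2}(N - N_\lambda).$$

Integer values for N, $N_\lambda$, $N_X$ and $N_Y$ were determined by rounding off
the values determined by the formulae so that $N_\lambda + N_X + N_Y = N$.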

5. SUMMARY. Beginning with an intuitively appealing definition of
synergy given by Hewlett [1969], we have attempted in this paper some
exploration of the implications of this definition, and tried to dispel
certain naive notions concerning the analytic characterization of synergy
and concerning the optimal mixing of drugs. We have also suggested a
testing procedure to determine the existence of synergy and have given
sample sizes required to detect it.

The  techniques  discussed  in  this  paper  are  illustrated  in  the  follow¬ 
ing  example. 


Suppose we wish to detect synergy in a mixture of drugs A and B.
Further suppose we know 1 unit of A is approximately equivalent to 3
units of B and that A and B have an optimal mixing rate of 1 part A to
2 parts B. Now, denoting A as X and B as Y we have $X_0 = 1.0$, $Y_0 = 3.0$
and $Y = pX = 2X$. To derive the best combination of A and B we find

$$X = X_0 Y_0/(pX_0 + Y_0) = .60 \text{ units of A},$$

and

$$Y = pX_0 Y_0/(pX_0 + Y_0) = 1.20 \text{ units of B}.$$

Now, suppose $X_0 = 1$ and $Y_0 = 3$ are approximately ED(.50)'s of A
and B and it is suspected that the combination (.60, 1.20) gives an
expected cure rate of .70. Then, for an $\alpha = .05$ level test with power
.80 we find N = 144 when $P_X = .50$ and $P_\lambda = .70$. We find $N_\lambda$, $N_X$
and $N_Y$ by the following:

$$N_\lambda = N\sigma_\lambda/(\sqrt{2}\,\sigma_X + \sigma_\lambda)$$

$$= (144)\sqrt{(.7)(.3)}\,/\,(\sqrt{2} \times \sqrt{(.5)(.5)} + \sqrt{(.7)(.3)})$$

$$= 56.62.$$

$$N_Y = N_X = \tfrac{1}{2}(N - N_\lambda) = \tfrac{1}{2}(144 - 56.62)$$

$$= 43.68.$$

Hence, we take 56 experimental units for the combination (.60, 1.20)
and 44 each for the individual applications of A (1 unit) and B (3 units).
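The allocation arithmetic in this example can be reproduced with a short sketch. The function names are illustrative, and the rounding scheme in `allocate_integer` (round the single-dose groups, give the remainder to the combination) is an assumption chosen so that the three group sizes add up to N; the paper only states that values were rounded to sum to N.

```python
from math import sqrt


def allocate(n_total, p_single, p_comb):
    """Split n_total into (n_comb, n_single) using
    n_comb = N*sigma_l / (sqrt(2)*sigma_x + sigma_l) and
    n_single = (N - n_comb)/2 for each of the two single-dose groups."""
    sigma_x = sqrt(p_single * (1.0 - p_single))
    sigma_l = sqrt(p_comb * (1.0 - p_comb))
    n_comb = n_total * sigma_l / (sqrt(2.0) * sigma_x + sigma_l)
    n_single = 0.5 * (n_total - n_comb)
    return n_comb, n_single


def allocate_integer(n_total, p_single, p_comb):
    """Integer group sizes that sum to n_total (assumed rounding rule)."""
    _, n_single = allocate(n_total, p_single, p_comb)
    n_single_int = round(n_single)
    n_comb_int = n_total - 2 * n_single_int
    return n_comb_int, n_single_int
```

Calling `allocate(144, 0.50, 0.70)` gives roughly 56.6 and 43.7, and `allocate_integer(144, 0.50, 0.70)` gives the 56 and 44 units quoted in the example.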




Minimum Sample Size for Detecting Synergy

Table I

Significance Level .05   Power .80

                             P_λ
P_X = P_Y      .4     .5     .6     .7     .8     .9
   .3         544    139     62     34     20     12
   .4           0    596    147     63     32     17
   .5           0      0    600    144     59     28
   .6           0      0      0    555    126     46
   .7           0      0      0      0    462     96
   .8           0      0      0      0      0    315

Table II

Significance Level .05   Power .90

                             P_λ
P_X = P_Y      .4     .5     .6     .7     .8     .9
   .3         751    192     84     45     26     15
   .4           0    823    204     86     44     23
   .5           0      0    830    198     81     37
   .6           0      0      0    768    174     66
   .7           0      0      0      0    637    132
   .8           0      0      0      0      0    435
Table III

Significance Level .01   Power .80

                             P_λ
P_X = P_Y      .4     .5     .6     .7     .8     .9
   .3         857    219     97     51     30     18
   .4           0    940    232     99     51     28
   .5           0      0    948    227     91     43
   .6           0      0      0    877    199     74
   .7           0      0      0      0    727    149
   .8           0      0      0      0      0    496

Table IV

Significance Level .01   Power .90

                             P_λ
P_X = P_Y      .4     .5     .6     .7     .8     .9
   .3        1113    284    126     68     39     23
   .4           0   1223    301    129     66     35
   .5           0      0   1232    293    119     57
   .6           0      0      0   1139    258     95
   .7           0      0      0      0    944    194
   .8           0      0      0      0      0    645
ABSTRACT. The problem of selecting the best out of several treatments
with dichotomous responses is considered in the framework of the
Bechhofer sequential selection model with emphasis on minimizing the
number of patients assigned to the inferior treatments. Adaptive sampling
rules are proposed for the situations where the response to the treatments
is delayed or where several patients have to be scheduled at each stage.
Protocols which employ the new sampling rules with various termination
rules considered in the literature are shown to be superior or comparable
to those which employ the familiar Vector-at-a-Time or Play-the-Winner
sampling rule in terms of the average sample number and the inferior
treatment number.

1. INTRODUCTION AND DEFINITION OF SAMPLING RULES. Let
$\Pi_1, \Pi_2, \ldots, \Pi_k$ be k ($k \ge 2$) binomial populations with respective
unknown probabilities of success $p_1, p_2, \ldots, p_k$ where $p_1 > p_i$ for
i = 2,3,...,k. The problem of identifying the population with the largest
probability of success, the 'best' population, has been extensively
studied in the literature. In this paper we are mainly concerned with the
sequential selection model for this problem as formulated by Bechhofer
(1958) and Bechhofer, Kiefer and Sobel (1968), and adapted by Sobel and
Weiss (1970) to the problem of clinical trials where several treatments
with dichotomous responses are being compared.

The Bechhofer model assumes sequential sampling, and consists of a
sampling rule which specifies the population to be sampled at any given
stage and a termination rule which directs when to stop sampling and how to
make the final choice of the best population. The selection is to be made
subject to the $(P^*, \Delta^*)$-admissibility requirement on the probability of
correct selection (CS) that

$$P(CS) \ge P^* \quad \text{for} \quad p_1 - \max\{p_2, p_3, \ldots, p_k\} \ge \Delta^* \qquad (1)$$

where $P^*$ ($1/k < P^* < 1$) and $\Delta^*$ ($0 < \Delta^* < 1$) are prespecified constants.

In the context of clinical trials the Bechhofer model provides
admissible protocols which assign patients to the treatments sequentially
in time, one or more at each stage, until the best treatment is identified
with a specified probability. $\Delta^*$ can be interpreted as the medically
significant or detectable difference. For specified $P^*$ and $\Delta^*$,
choice among the various possible admissible protocols is usually made
on the basis of the (random) number $N_i$ of patients assigned to treatment
i (i = 1,2,...,k) and the total number N of patients needed to reach a
decision. More specifically, Sobel and Weiss (1970, 1972) base their
comparisons on the loss functions

$$E(N) = \sum_{i=1}^{k} E(N_i), \qquad \sum_{i=2}^{k} E(N_i) \qquad (2)$$

and the risk

$$E(L) = \sum_{i=2}^{k} (p_1 - p_i)\,E(N_i),$$

the last two measures being given more importance for obvious ethical
reasons.

It is convenient at this point to specialize our discussion to the
case when k = 2; a major portion of this paper as well as most of the past
work in this area is confined to the comparison of two treatments. The
admissibility condition (1) now reads

$$P(CS) \ge P^* \quad \text{for} \quad \Delta = p_1 - p_2 \ge \Delta^*, \qquad (3)$$

and the loss functions of interest, given in (2), become $E(N)$, known as the
Average Sample Number (ASN), and $E(N_2)$, the Inferior Treatment Number (ITN).

Most  of  the  protocols  considered  so  far  in  the  literature  fall  into 
two  broad  classes  depending  on  the  sampling  rule  employed.  The  older  and 
more  familiar  sampling  rule  is  the  so-called  Vector-at-a-Time  (VT)  rule 
which  assigns  patients  to  both  of  the  two  treatments  at  each  stage,  one  to 
each  treatment  randomly,  until  a  selection  is  made  based  on  the  termination 
rule.  An  essentially  equivalent  way  of  implementing  the  VT  rule  is  to 
assign  the  first  patient  to  one  of  the  two  treatments  at  random  and  then  to 
alternate  the  treatments  given  to  the  subsequent  patients  as  they  arrive. 

It is readily seen that in any protocol which employs the VT rule,
regardless of the termination rule used, we have $E(N_1) = E(N_2) = E(N)/2$.

Since one of the basic aims of a clinical trial is to reduce the ITN,
it was suggested by Zelen (1969) that sampling be done according to the
so-called Play-the-Winner (PW) rule instead of the VT rule. The PW rule was
originally studied by Robbins (1956) as a data-dependent policy for the
two-armed bandit problem. According to this rule the first patient to
arrive is given one of the two treatments chosen at random. The ith
patient (i = 2,3,...) is given treatment 1 (treatment 2) if the (i-1)th
patient received treatment 1 (treatment 2) and it succeeded or if the
(i-1)th patient received treatment 2 (treatment 1) and it resulted in a
failure. Zelen investigated the performance of the PW sampling rule in
the Anscombe-Colton model (Anscombe, 1963; Colton, 1963) for clinical
trials and showed that in general it leads to a significant reduction in
the number of patients who receive the inferior treatment.
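The PW assignment described above can be written as a one-step transition. The following is a minimal sketch; the function names, the treatment labels 1 and 2, and the use of a seeded `random.Random` generator are illustrative assumptions.

```python
import random


def pw_next(prev_treatment, prev_success):
    """Play-the-Winner: repeat the treatment after a success,
    switch to the other one after a failure."""
    return prev_treatment if prev_success else 3 - prev_treatment


def pw_assignments(p1, p2, n_patients, rng):
    """Simulate n_patients under the PW rule; returns a list of
    (treatment, success) pairs. p1, p2 are the success probabilities."""
    probs = {1: p1, 2: p2}
    treatment = rng.choice((1, 2))        # first patient: random treatment
    history = []
    for _ in range(n_patients):
        success = rng.random() < probs[treatment]
        history.append((treatment, success))
        treatment = pw_next(treatment, success)
    return history
```

In the extreme case `pw_assignments(1.0, 0.0, 20, random.Random(0))`, at most one patient receives treatment 2, which illustrates how the rule steers patients away from the inferior treatment.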

Subsequently Sobel and Weiss (1970) and several others (see Hoel,
Sobel and Weiss, 1975 for an excellent review) have shown that the PW rule
is superior to the VT rule in the Bechhofer model in terms of reducing
both the ASN and ITN for fixed $P^*$ and $\Delta^*$. Most of the emphasis here has
been on devising different termination rules and comparing the resulting
protocols with the already existing ones.

Despite its poor performance in terms of the ASN and the ITN, the VT
sampling rule has some advantages in its implementation which are not
shared by the PW rule. For example, in the PW rule, the allocation of any
given patient to a treatment depends on the outcome of the preceding trial,
and hence it is required that the response to the treatments be
instantaneous or that the response be available by the time a new patient
arrives; the VT rule, on the other hand, is applicable in situations of
delayed response, and allows for the treatment of several patients at each
stage.

One  of  the  purposes  of  the  present  paper  is  to  propose  and  study 
some  sampling  rules  which  are  applicable  in  situations  of  delayed  response. 
The  simplest  case  here  is  when  patients  arrive  twice  as  fast  as  the 
response  to  any  one  of  the  two  treatments  is  made  available.  This  is 
considered  in  Section  2.  The  Play-the-Clear-Winner  (PCW)  sampling  rule 
introduced  to  handle  this  case  is  defined  as  follows:  At  the  first  stage, 
the  first  two  patients  to  arrive  receive  treatments  1  and  2  respectively. 

At any given stage assignment of treatments is made either for two
patients or for one patient depending on the outcome of the preceding
stage. At the ith stage (i = 2,3,...) treatments 1 and 2 are assigned
randomly to two patients if, at the (i-1)th stage, either (a) treatments 1
and 2 were assigned to two patients and they both resulted in a success
or a failure or (b) treatment 1 or 2 was assigned to one patient and it
resulted in a failure. At the ith stage (i = 2,3,...) treatment 1 (2)
is assigned to one patient if, at the (i-1)th stage, either (a) treatments
1 and 2 were assigned to two patients and treatment 1 (2) resulted in a
success and treatment 2 (1) resulted in a failure, or (b) treatment 1 (2)
was assigned to one patient and it resulted in a success.

It can be easily verified that the PCW sampling rule is equivalent to
the following rule: the first two patients to arrive receive treatments 1
and 2 randomly. The ith patient (i = 3,4,...) to arrive is given treatment
1 (2) if the (i-2)th patient either (a) received treatment 1 (2) and it
resulted in a success or (b) received treatment 2 (1) and it resulted in
a failure. This formulation implies that the PCW rule is equivalent to
implementing two PW rules in parallel, one starting with treatment 1 and
the other with treatment 2, a possible solution to the delayed response
case suggested by Zelen (1969). This formulation also shows that the
PCW rule is applicable in situations where the response to the treatments
is instantaneous but two patients are to be scheduled to receive
treatments at each stage.
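The patient-by-patient formulation of the PCW rule lends itself to a short simulation sketch: each patient's treatment depends only on the outcome of the patient two places earlier. The function name and the seeded generator are illustrative assumptions.

```python
import random


def pcw_assignments(p1, p2, n_patients, rng):
    """Simulate the PCW rule in its 'two PW chains in parallel' form.
    Returns a list of (treatment, success) pairs in order of arrival."""
    probs = {1: p1, 2: p2}
    first_two = [1, 2]
    rng.shuffle(first_two)                # treatments 1 and 2 in random order
    history = []
    for i in range(n_patients):
        if i < 2:
            treatment = first_two[i]
        else:
            prev_t, prev_s = history[i - 2]   # outcome two patients back
            treatment = prev_t if prev_s else 3 - prev_t
        success = rng.random() < probs[treatment]
        history.append((treatment, success))
    return history
```

Because patient i only needs the response of patient i-2, the response to each treatment may take up to one inter-arrival time longer than under the plain PW rule.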

The performance of protocols which employ the PCW sampling rule and
various termination rules considered in the literature in connection with
the PW rule is summarized in Section 2. Comparisons with the corresponding
protocols which use the PW and the VT sampling rules are also
presented. It is shown that the PCW rule is in general superior to the
other two rules in the sense that it requires comparable or smaller ASN
and ITN to reach a decision in addition to its greater generality over
the PW rule. Numerical results on the comparisons are presented only for
$P^* = 0.95$ and $\Delta^* = 0.2$.

The formulation of the PCW rule as two PW rules in parallel allows
us to extend it to situations where m patients are to be scheduled at
each stage or patients arrive m times as fast as the response to any one
of the two treatments is made available. This is accomplished by simply
implementing m PW rules in parallel, [m/2] starting with one of the two
treatments chosen at random and the remaining starting with the other
treatment. This method of dealing with the delayed-response situations
was again essentially suggested by Zelen (1969). Section 3 deals with
this rule (denoted PWP for Play-the-Winner-in-Parallel) for m = 3. In
contrast to Section 2 only a very limited number of termination rules are
considered here. Comparisons in terms of ASN and ITN indicate that the
behavior of the PWP rule is similar to that of the PCW rule discussed in
Section 2.

In Section 4 we return to the problem of selecting the best out of
k ($k \ge 3$) binomial populations. The generalization of the VT sampling rule
to three or more populations is straightforward. All of the k populations
are sampled at each stage. Equivalently, the populations are randomly
ordered at the outset and are sampled, one at each stage according to this
order, sampling returning to the first population at the end of a cycle.
A generalization of the PW rule, called the Play-the-Winner-Cyclical (PWC)
sampling rule, appropriate for the present case was studied by Sobel and
Weiss (1972). According to the PWC rule, the k populations are randomly
ordered at the outset. Sampling starts with the first population. At the
ith stage (i = 2,3,...) the tth population (t = 1,2,...,k) is sampled if,
at the (i-1)th stage, either (a) the tth population was sampled and it
resulted in a success or (b) the (t-1)th population (the 0th population
being identified with the kth) was sampled and it resulted in a failure.
Admissible protocols involving the VT and the PWC sampling rules and the
so-called inverse stopping rule were compared by Sobel and Weiss (1972)
using the loss functions defined earlier in this section. They showed that
the PWC rule was uniformly better than the VT rule for this stopping rule.
Except for their work nothing is at present known about the behavior of
the VT or the PWC sampling rule for other termination rules.
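The cyclical rule can be sketched as a transition on positions 1,...,k in the random order fixed at the outset: stay on a success, move to the next position (wrapping from k back to 1) on a failure. The names below are illustrative assumptions.

```python
import random


def pwc_next(position, success, k):
    """Position (1..k in the fixed random order) sampled at the next stage."""
    return position if success else position % k + 1


def pwc_sequence(probs, n_stages, rng):
    """Simulate the PWC rule for populations 0..k-1 with success
    probabilities probs; returns a list of (population, success) pairs."""
    k = len(probs)
    order = list(range(k))
    rng.shuffle(order)                    # random ordering fixed at the outset
    position = 1                          # sampling starts with the first one
    history = []
    for _ in range(n_stages):
        population = order[position - 1]
        success = rng.random() < probs[population]
        history.append((population, success))
        position = pwc_next(position, success, k)
    return history
```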

A natural generalization of the PCW rule to k populations is as
follows: Sample all k populations at the first stage. At the ith stage
(i = 2,3,...) sample only those populations which were sampled at the
(i-1)th stage and resulted in a success. If no such population exists at
the ith stage, then sample all the k populations again and continue the
process. We shall refer to this sampling rule also as the PCW rule, and
note that it is also applicable in situations where patients arrive twice
as fast as the response to the treatments becomes available. In Section 4
we present some numerical results for the PCW rule for k = 3 with the
inverse termination rule and some of its modifications applicable only to
the VT and the PCW rules. It is shown that with inverse termination the
PCW and the PWC rules behave more or less identically while the modified
rules lead to improved protocols when employed with the VT or the PCW rules.
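The stage-to-stage step of this k-population PCW rule is small enough to state as code. This is a minimal sketch; the function name and the representation of a stage as a dictionary from population index to outcome are illustrative assumptions.

```python
def pcw_next_stage(results, k):
    """Given the previous stage's outcomes, return the populations to sample.

    results maps each population index (0..k-1) sampled at the previous
    stage to True (success) or False (failure). The next stage samples the
    previous stage's winners, or all k populations if there were none."""
    winners = sorted(i for i, ok in results.items() if ok)
    return winners if winners else list(range(k))
```

For k = 2 this reproduces the two-population rule: both treatments are sampled after a double success, a double failure, or a lone failure, and a single treatment is repeated after it alone succeeds.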

Throughout this paper numerical comparisons of the protocols are given
only for $P^* = 0.95$, $\Delta^* = 0.2$ and a limited number of values of the
parameters $p_1, p_2, \ldots$. More extensive comparisons as well as the
analytical results pertaining to the protocols will be presented elsewhere.

2. THE PCW SAMPLING RULE FOR TWO BINOMIAL POPULATIONS. In this
section we consider several termination rules proposed in the literature in
connection with the PW sampling rule. The values of ASN and ITN are
presented for admissible protocols ($P^* = 0.95$, $\Delta^* = 0.2$) which employ these
termination rules and the VT, PW and PCW sampling rules for
$\Delta = p_1 - p_2 = 0.2$ and $p_0 = (p_1 + p_2)/2 = 0.1(0.1)0.9$. The sample sizes
corresponding to other values of these parameters are available but are not
given here since the comparisons presented here reflect the general
performance of the protocols quite adequately. Protocols are identified
throughout by the sampling rule and the termination rule employed. For
example, PCW3 refers to the protocol which uses the PCW sampling rule and
Termination Rule 3. Symbols such as $P(CS|PCW3)$, $E(N_2|VT4)$ and $E(N|PW1)$
have their obvious meanings. For i = 1,2, the cumulative number of successes
and failures on $\Pi_i$ at any given stage will be denoted by $S_i$ and $F_i$
respectively.

Termination Rule 1 (Sobel and Weiss, 1970). Sampling stops as soon as
$|S_1 - S_2| = r$, where r is chosen so as to make the resulting protocol
admissible. The population with the larger number of successes is chosen as
the better; in case $S_1 = S_2$, the better population is chosen at random.

For given $P^*$ and $\Delta^*$, the minimum values of r which make the protocols
VT1 and PW1 admissible have been determined by Sobel and Weiss (1970).
This can be done for PCW1 using a similar method. For $P^* = 0.95$ and
$\Delta^* = 0.2$, these are given by r = 4 for VT1, r = 10 for PW1 and r = 8 for
PCW1. Exact expressions for the ASN and ITN of VT1 and PW1 are also
given by Sobel and Weiss (1970). Similar expressions can be obtained for
PCW1.

Termination Rule 2 (Sobel and Weiss, 1971). Sampling stops as soon
as either $S_1$ or $S_2$ (or both) equals r where r is preassigned to make the
protocols admissible. The population which achieves r successes first is
declared the better. If both achieve r successes simultaneously, then the
better population is selected at random.

It can be shown that, for all $p_1$, $p_2$, $P(CS|VT2) = P(CS|PW2) =
P(CS|PCW2)$. Hence the same value of r would make all these three protocols
admissible; r equals 20 for $P^* = 0.95$ and $\Delta^* = 0.2$. Sobel and Weiss (1971)
have shown that $E(N|PW2) < E(N|VT2)$ and $E(N_2|PW2) < E(N_2|VT2)$ uniformly in
$p_1$ and $p_2$. These inequalities can be shown to hold with PW2 replaced by
PCW2.
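The comparison of VT2 and PW2 can be explored by Monte Carlo. The following is a minimal sketch under stated assumptions: the VT rule is implemented as one patient per treatment at every stage, the function names and seeds are illustrative, and the estimates are simulation averages rather than the exact expressions of Sobel and Weiss.

```python
import random


def run_vt2(p1, p2, r, rng):
    """One VT trial with Termination Rule 2; returns (N1, N2)."""
    s1 = s2 = pairs = 0
    while s1 < r and s2 < r:
        s1 += rng.random() < p1           # one patient on treatment 1
        s2 += rng.random() < p2           # one patient on treatment 2
        pairs += 1
    return pairs, pairs


def run_pw2(p1, p2, r, rng):
    """One PW trial with Termination Rule 2; returns (N1, N2)."""
    probs = (p1, p2)
    successes = [0, 0]
    patients = [0, 0]
    t = rng.randrange(2)                  # first treatment chosen at random
    while max(successes) < r:
        patients[t] += 1
        if rng.random() < probs[t]:
            successes[t] += 1             # success: stay on this treatment
        else:
            t = 1 - t                     # failure: switch treatments
    return patients[0], patients[1]


def mean_counts(run, p1, p2, r, reps, seed):
    """Average (N1, N2) over reps simulated trials."""
    rng = random.Random(seed)
    tot1 = tot2 = 0
    for _ in range(reps):
        n1, n2 = run(p1, p2, r, rng)
        tot1 += n1
        tot2 += n2
    return tot1 / reps, tot2 / reps
```

As a sanity check, take $p_1 = 0.2$, $p_2 = 0$ and r = 20. Treatment 1 then needs 20/0.2 = 100 patients on average, so the expected number on treatment 2 is 100 under VT2, while under PW2 it is 80 (one per failure on treatment 1) plus 1/2 for the random start. `mean_counts(run_vt2, 0.2, 0.0, 20, 2000, 1)` and `mean_counts(run_pw2, 0.2, 0.0, 20, 2000, 1)` should land close to these values.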

The following termination rule is a modification of Termination Rule 2,
and is applicable to the PCW and the VT sampling rules but not to the PW
rule. It is defined in terms of the cumulative number of 'clear successes',
$S_i^c$ on $\Pi_i$ (i = 1,2), defined by $S_i^c = S_i$ - (the number of times $\Pi_1$
and $\Pi_2$ were sampled together and they both succeeded).

Termination Rule 3. Sampling stops as soon as either $S_1^c$ or $S_2^c$ (or
both) equal r. The population with the larger total number of successes is
chosen as the better. If $S_1 = S_2$, then the better population is chosen at
random.

For $P^* = 0.95$ and $\Delta^* = 0.2$, the r value which makes the protocol
admissible equals 12 for PCW3 and 9 for VT3.
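The bookkeeping for clear successes can be sketched as follows. The representation of a trial as a list of stages, each a dictionary from treatment label (1 or 2) to outcome, and the function name are illustrative assumptions.

```python
def clear_successes(stages):
    """Return {1: S1c, 2: S2c}, the cumulative clear successes.

    stages is a list of dicts; each dict maps a treatment label (1 or 2)
    sampled at that stage to True (success) or False (failure). A stage in
    which both treatments were sampled and both succeeded adds to S1 and S2
    but is subtracted from both clear-success counts."""
    totals = {1: 0, 2: 0}
    both_succeeded = 0
    for stage in stages:
        for treatment, ok in stage.items():
            totals[treatment] += ok
        if len(stage) == 2 and all(stage.values()):
            both_succeeded += 1
    return {t: totals[t] - both_succeeded for t in (1, 2)}
```

Under Termination Rule 3, sampling would stop at the first stage at which either returned count reaches r.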

The next termination rule, originally studied by Hoel (1972) for the
PW sampling rule, is based on the statistics $R_1 = S_1 + F_2$ and
$R_2 = S_2 + F_1$.

Termination Rule 4. Sampling stops as soon as either $R_1$ or $R_2$ reaches
a preassigned value r, and the population $\Pi_i$ is selected as the better if
$R_i$ reaches r first for i = 1,2. With the PCW and the VT sampling rules,
r + 1 may be reached before stopping. If both $R_1$ and $R_2$ reach r
simultaneously, as is possible with the PCW and the VT rules, the better
population is selected at random.

It can be shown that $P(CS|PCW4) = P(CS|PW4)$. Hence, as in the case of
Termination Rule 2, the same value of r would make both of these protocols
admissible. For $P^* = 0.95$ and $\Delta^* = 0.2$, the minimum value of r equals
33 for PCW4 and PW4, and 29 for VT4.

Termination Rule 5 (Fushimi, 1973). Sampling stops as soon as either
$|S_1 - S_2| = r$ or $F_1 + F_2 = s$. The population with the larger number of
successes is chosen as the better, and in case $S_1 = S_2$, the better
population is chosen at random.

For any given $P^*$ and $\Delta^*$ there are in general several values of the
pair (r,s) which would make the protocols VT5, PW5 and PCW5 admissible.
Fushimi (1973) shows how the 'best' pair can be obtained for PW5 using the
property that, as s tends to $\infty$, the present termination rule reduces to
Termination Rule 1 and, as r tends to $\infty$, it reduces to Termination Rule 2.
The 'best' choice of (r,s) corresponding to PCW5 can also be determined
along the same lines.

Termination Rule 6 (Nordbrock, 1975). Sampling stops as soon as either
$|S_1 - S_2| = r$ or $|\hat{p}_1 - \hat{p}_2|$ exceeds a bound involving s [the expressions
for this bound and for $\hat{p}_i$ are illegible in the source]; the population
with the larger number of successes is chosen as the better, and in case
$S_1 = S_2$, the better population is chosen at random.

The remarks made in connection with Termination Rule 5 regarding the
choice of (r,s) apply here as well. (r,s) equals (8,4.2) for PCW6, (11,4.2)
for PW6 and (4,3.8) for VT6 when $P^* = 0.95$ and $\Delta^* = 0.2$.

Table 1 summarizes our results on the ASN and the ITN of the protocols
introduced above for $P^* = 0.95$, $\Delta^* = \Delta = 0.2$ and $p_0 = 0.1(0.1)0.9$. As
mentioned earlier, the overall behavior of the protocols is adequately
reflected by the results of this table. It can be seen that, with a
few exceptions (for example, for values of $p_1$ very close to 1), the PCW rule
requires comparable or smaller sample sizes when compared to the VT or the
PW rule. The increased generality of the VT sampling rule over the PCW rule,
and that of the latter over the PW rule should also be kept in mind when
comparing these protocols.

3. THE PWP SAMPLING RULE FOR TWO BINOMIAL POPULATIONS. The PWP
sampling rule is considered here for Termination Rules 2 and 5 of the
previous section. For $P^* = 0.95$ and $\Delta^* = 0.2$, r = 20 for PWP2, and
(r,s) = (8,41) for PWP5. Table 2 gives the sample sizes for these two
protocols corresponding to the same values of the parameters as in Table 1.
It can be seen that the behavior of the PWP sampling rule is quite similar
to that of the PCW rule.
4. THE PCW SAMPLING RULE FOR THREE BINOMIAL POPULATIONS. The PCW
sampling rule for three binomial populations is considered here with
Termination Rule 2 defined in Section 2, and two of its modifications
applicable only to the PCW and the VT sampling rules. The protocol
PWC2 has been studied by Sobel and Weiss (1972). Closed form expressions
for $P(CS|PCW2)$ and $E(N_i|PCW2)$, i = 1,2,3, can be obtained using the method
of Sobel and Weiss (1972). Numerical results on the probabilities of
correct selection for various values of the parameters indicate that, as
in the case of two populations, $P(CS|PCW2) = P(CS|PWC2)$ even though we
have not been able to establish this. For $P^* = 0.95$ and $\Delta^* = 0.2$, the
common value of r which makes the protocols PCW2 and PWC2 admissible is 28.

The modifications of Termination Rule 2 which we consider are quite
similar to Termination Rule 3 of Section 2 in that they are obtained by
defining 'clear successes' appropriately. In the first modification,
Termination Rule 3', we define $T_1$ = (number of times all three populations
were sampled and either $\Pi_1$ and $\Pi_2$ or $\Pi_1$ and $\Pi_3$ succeeded and the
other failed) + (number of times $\Pi_1$ and $\Pi_2$ or $\Pi_1$ and $\Pi_3$ were sampled
and $\Pi_1$ succeeded and the other failed) + 2(number of times all three
populations were sampled and $\Pi_1$ alone succeeded), and $T_2$ and $T_3$
symmetrically. Termination Rule 3' is then obtained from Termination Rule 2
by simply replacing $S_i$ by $T_i$ for i = 1,2,3. Similarly, Termination Rule
3'' is obtained from Termination Rule 2 by replacing $S_i$ by $U_i$ for
i = 1,2,3, where $U_1$ = (number of times all three populations were sampled
and either $\Pi_1$ and $\Pi_2$ or $\Pi_1$ and $\Pi_3$ succeeded and the other failed)
+ (number of times $\Pi_1$ and $\Pi_2$ or $\Pi_1$ and $\Pi_3$ were sampled and they
both succeeded) + 2[(number of times all three populations were sampled and
$\Pi_1$ alone succeeded) + (number of times $\Pi_1$ and $\Pi_2$ or $\Pi_1$ and $\Pi_3$
were sampled and $\Pi_1$ alone succeeded) + (number of times $\Pi_1$ alone was
sampled and it succeeded)], and $U_2$ and $U_3$ are analogously defined. The r
values which make the Termination Rules 3' and 3'' admissible for
$P^* = 0.95$ and $\Delta^* = 0.2$ are respectively 24 and 37.

Table  3  summarizes  the  expected  sample  sizes  for  the  protocols  of  this 
section  for  selected  values  of  the  parameters.  As  in  the  case  of  Tables 
1  and  2,  more  extensive  comparisons  are  available  but  are  not  presented. 

It is clear from Table 3 that PCW3' is to be preferred over the others.
TABLE 1. EXPECTED SAMPLE SIZES FOR THE PROTOCOLS OF SECTION 2
FOR P* = 0.95 AND Δ = Δ* = 0.2.


E(N2) 


PCWl  Pffl  VTl 


-PCWl  PWl  VTl 


40.5 

35.7 

30.9 
26.0 

20.9 

15.8 

11.0 

6.5 

2.2 


20.0 

19.8 

19.2 
18.7 

18.2 

18.7 
19.2 

19.8 

20.0 


91.0 

81.5 

71.9 

61.9 

51.5 

41.2 

31.2 
22.0 
13.4 


40.0 

39.6 

38.5 

37.4 

37.0 

37.4 

38.5 

39.6 
40.0 


PCW2  PW2  VT2 


81.0 

52.9 

38.6 

29.6 

23.3 

18.4 
14.1 

9.9 

5.0 


80.5 

52.4 

38.1 

29.1 
22.8 
17.8 

13.4 
8.8 
2.5 


100.0 

66.7 

50.0 

39.9 

33.2 
28.4 

24.9 

22.2 
20.0 


181.0 

119.5 

88.5 

69.4 

56.4 

46.6 

38.8 

31.8 

24.9 


180.5  200.0 

119.1  133.4 

88.0  100. P 

69.0  79.8 


49.0 

34.7 

27.2 

22.2 

18,5 

15.4 

12.5 
j  .3 
5.0 


45.0 

53.0 

28.0 

25.4 

24.7 

25.4 

28.0 

33.0 

45.0 


109.0 

77.9 
61.8 

51.5 

44.1 

38.2 

32.9 

27.6 

21.0 


90.0 

66.6 

56.0 

50.8 

49.4 

50.8 

56.0 

66.6 

90.0 
TABLE 1. (Continued)


^0 

E(N) 

PCW4 

PW4 

VT4 

PCW4 

PW4 

VT4 

moM 

26.6 

26.5 

24.3 

59.4 

59.0 

48.6 

mSB 

25.9 

25.8 

24.3 

58.6 

58.3 

48.6 

Bo 

25.1 

24.9 

24.3 

57.7 

57.3 

48.6 

BB 

24.0 

23.8 

24.2 

56.5 

56.2 

48.4 

22.6 

22.3 

24.2 

54.7 

48.4 

0.6 

20.3 

24.2 

52.9 

52.6 

48.4 

0.7 

17.9 

17.3 

.24.3 

50.1 

49.6 

48.6 

13.3 

12.4 

24.3 

45.5 

44.8 

48.6 

HH 

2.5 

24.3 

35.0 

48^6 

              E(N_2)                    E(N)
p_0     PCW5    PW5    VT5      PCW5    PW5    VT5
0.1     20.3   20.4   19.7      45.2   45.9   39.4
0.2     20.5   22.0   19.9      46.2   50.1   39.8
0.3     20.0   22.7   20.4      45.6   52.6   40.8
0.4     18.8   22.2   21.2      43.6   52.7   42.4
0.5     16.8   20.3   22.2      39.9   49.8   44.4
0.6     14.0   16.9   23.3      34.4   43.5   46.6
0.7     10.8   12.2   24.4      28.0   34.1   48.8
0.8      7.8    7.1   24.9      22.0   23.9   49.8
0.9      5.0    2.3   25.0      17.0   14.2   50.0

PCW6 

PW6 

VT6 

PCW6 

PW6 

VT6 

0.1 

13.4 

13.5 

14.1 

29.1 

29.8 

28.2 

0.2 

13.7 

13.9 

14.8 

30.1 

31.1 

29.6 

0.3 

14.2 

14.6 

14.4 

31.9 

33.3 

28.9 

0.4 

15.6 

16.2 

16.0 

35.6 

37.9 

32.0 

0.5 

15.7 

17.8 

17.5 

37.1 

43.2 

35.0 

0.6 

13.9 

16.7 

18.7 

34.2 

41.5 

37.3 

0.7 

10.8 

12.0 

19.3 

27.9 

33.7 

38.6 

7.7 

7.1 

19.9 

21.8 

23.9 

39.9 

0.9 

4.9 

2.4 

19.9 

16.9 

14.5 

39.8 
TABLE 2. EXPECTED SAMPLE SIZES FOR THE PROTOCOLS OF SECTION 5
FOR P* = 0.95 AND Δ = Δ* = 0.2


p0     E(N2|PWP2)   E(N|PWP2)   E(N2|PWP5)   E(N|PWP5)
0.1       81.1        166.4        20.2         45.3
0.2       53.0        113.1        20.6         46.6
0.3       38.7         85.7        20.2         46.3
0.4       29.8         68.5        19.1         44.5
0.5       23.5         56.4        17.3         41.2
0.6       18.7         47.2        14.7         36.2
0.7       14.6         39.8        11.8         30.3
0.8       10.7         33.4         9.0         24.9
0.9        6.9         27.6         6.7         20.5

TABLE 3. EXPECTED SAMPLE SIZES FOR THE PROTOCOLS OF SECTION 4
FOR P* = 0.95 AND Δ* = 0.2


                          E(N)                       E(N2) + E(N3)
p1    p2=p3    PWC2   PCW2   PCW3*  PCW3"     PWC2   PCW2   PCW3*  PCW3"
0.2    0.0    140.0  140.0   67.3   95.0     112.3  113.0   54.9   77.0
0.3    0.1     93.3   93.3   51.6   67.4      73.0   73.6   41.2   53.5
0.4    0.2     70.0   70.0   44.4   54.0      52.9   53.6   34.4   41.7
0.5    0.3     55.9   55.9   40.7   45.6      40.4   41.1   30.3   33.8
0.6    0.4     46.4   46.4   38.8   39.7      31.5   32.2   27.2   27.9
0.7    0.5     39.6   39.7   37.9   35.1      24.4   25.2   24.4   22.8
0.8    0.6     34.6   34.6   37.2   31.2      18.1   19.1   20.8   17.8
0.9    0.7     30.8   30.8   35.4   27.4      11.3   12.9   14.9   12.2
1.0    0.8     27.9   28.0   29.2   22.8       1.7    5.0    5.0    5.0
PREDICTIVISM AND SAMPLE REUSE


Seymour  Geisser 

School  of  Statistics 
University  of  Minnesota 


ABSTRACT. This paper emphasizes the paramount importance of prediction
as opposed to estimation and reviews a variety of general structures for
implementing the predictivistic outlook. It also stresses in particular the
newly devised predictive sample reuse method as a highly flexible and
versatile tool in low structure situations. An illustration is given for a
simple survival situation.

1. INTRODUCTION. The fundamental thesis of this paper is that the
inferential emphasis of Statistics, theory and concomitant methodology, has
been misplaced. By this is meant that the preponderance of statistical
analyses deals with problems which involve inferential statements concerning
parameters. The view proposed here is that this stress should be diverted
to statements about observables. With regard to parameters we take the
narrow view which relegates them at most to be components of a statistical
model that are not capable of being observed or potentially observed. This
is not necessarily to deny them their utility in many hypothetical
frameworks but there has been a strong tendency to exaggerate their
importance in statistical inference. Even such a compelling "parameter" as
the speed of light is in some sense ostensibly capable of being measured
(observed) though perhaps subject to error. In this sense it is at least a
potentially observable entity. Other values which often are misdesignated as
parameters are those defined as a function of a finite number of observables
or potential observables which typically occur in sample survey situations.
For example we may be trying to "estimate" the total response of a specific
finite population by observing some random portion of that population. The
unobserved responses are presumably potentially observable (or the
randomization is meaningless) and it is maintained that we are basically
predicting them or some function of them. This is certainly within the realm
of prediction though it is generally referred to as estimating a parameter
of a finite population.

Hence these two previously mentioned cases, measuring some physically
meaningful constant and estimating functions of observables, are within the
realm of predictivism. It is our contention that in other cases the
introduction of a convenient parametric statistical model seems to impel
statisticians to reformulate an experimenter's often imprecisely framed
question concerning the data into a parametric analysis even when the
parameters are completely artificial constructs. We then proceed to foist
upon the unwary client "precise" statements about these too often
nonexistent entities. This tendency is reinforced because we have too long
been subjected to solutions to hypothetical problems which invariably begin:
"suppose we are interested in the estimation of a parametric function
BLAH($\theta$)." This stress on parametric inference made fashionable by
mathematical statisticians has been not only a comfortable posture but also
a secure buttress for the preservation of the high esteem enjoyed by applied
statisticians because exposure by actual observation in parametric
estimation is rendered virtually impossible. Of course those who opt for
predictive inference, i.e. predicting observables or potential observables,
are at risk in that their predictions can be evaluated to a large extent by
either further observation or by a sly client withholding a random portion
of the data and privately assessing a statistician's prediction procedures
and perhaps concurrently his reputation. Therefore much may be at stake for
those who adopt the predictivistic or observabilistic or aparametric view.
But its relevance is clear.

This work was supported in part by U.S. Army Grant DAHCOi+-7^-P~02l6.

It was the burden of a previous paper, Geisser (1971), to argue that
most  problems  currently  cast  in  terms  of  parametric  estimation  and  testing 
could  be  more  informatively  reformulated  in  a  predictivistic  mode.  A  general 
catalogue  of  such  problems  was  presented  there  and  the  Bayesian  inferential 
approach  stressed.  In  this  paper  we  shall  discuss  the  problem  of  prediction 
per  se  from  a  variety  of  structures  ranging  from  high  to  low  depending  upon 
the  amount  of  information  infused  into  the  model.  In  particular  we  will 
stress  a  new  low  structure  approach  termed  predictive  sample  reuse. 


2. HIGH STRUCTURE. The high structure approach to statistical prediction
involves the tight apparatus of a prior distribution for the parameters
involving known hyperparameters and a specified likelihood, i.e. a joint
sampling distribution of observables, past and future, as it were. Hence we
need assume that $(X_1,\ldots,X_{N+M})$ or in a more compact notation
$(X^{(N)}, X_{(M)})$ has joint distribution $F(x^{(N)}, x_{(M)} \mid \theta)$
where $\theta$ is a set of unknown parameters. Further, a prior distribution
on $\theta$, say $G(\theta \mid \tau)$, is also assumed where the set of
hyperparameters $\tau$ is known. The posterior distribution of $\theta$ is
then based on the observed $X^{(N)} = x^{(N)}$,

    $dG(\theta \mid x^{(N)}, \tau) = \frac{F(x^{(N)} \mid \theta)\, dG(\theta \mid \tau)}{F(x^{(N)} \mid \tau)}$    (2.1)

where

    $F(x^{(N)} \mid \tau) = \int F(x^{(N)} \mid \theta)\, dG(\theta \mid \tau).$    (2.2)

This then permits the calculation of the predictive distribution of
$X_{(M)}$ given $x^{(N)}$ and $\tau$, resulting in

    $P(x_{(M)} \mid x^{(N)}, \tau) = \int F(x_{(M)} \mid x^{(N)}, \theta)\, dG(\theta \mid x^{(N)}, \tau)$    (2.3)

where

    $F(x_{(M)} \mid x^{(N)}, \theta) = \frac{F(x^{(N)}, x_{(M)} \mid \theta)}{F(x^{(N)} \mid \theta)},$    (2.4)

the denominator of the above being the marginal sampling distribution of
the observed random variables $X^{(N)}$. In essence, (2.3) represents the
ultimate in statistical prediction and everything else is a summary of
one kind or another of this distribution function. If point prediction
is of interest then one might choose as a point predictor the predictive
expectation of (2.3)

    $E(X_{(M)} \mid x^{(N)}, \tau)$    (2.5)

or the median or the mode of (2.3) or whatever ensues from a particular
loss function.


Often in this approach there is a necessary relaxation of the
assumption that $\tau$ is known. This is generally handled in one of two
ways. First it is often the case that little loss in terms of incoherence
is engendered by assuming an improper prior for the hyperparameter $\tau$.
Hence a new predictive distribution is obtained by calculating

    $P(x_{(M)} \mid x^{(N)}) = \int P(x_{(M)} \mid x^{(N)}, \tau)\, dG(\tau \mid x^{(N)}).$    (2.6)


A second approach, usually associated with empirical Bayes procedures, is
to "estimate" $\tau$ from the marginal distribution $F(x^{(N)} \mid \tau)$
given in (2.2) by maximum likelihood or the method of moments or any other
convenient procedure. This then results in an approximate predictive
distribution $P(x_{(M)} \mid x^{(N)}, \hat{\tau})$ and a point predictor,
say, $E(X_{(M)} \mid x^{(N)}, \hat{\tau})$.

Historically there have also been two other high structure approaches.
The first by Fisher (1956) was termed fiducial inference and the second by
Fraser (1968) termed structural inference. These generally require for
their implementation a much more restrictive sampling distribution and an
assumption of complete ignorance concerning $\theta$ which in turn implies
the absence of $\tau$. Here one would calculate the fiducial or structural
distribution $\varphi(\theta \mid x^{(N)})$ and then compute the predictive
distribution of $X_{(M)}$,

    $P(x_{(M)} \mid x^{(N)}) = \int F(x_{(M)} \mid x^{(N)}, \theta)\, d\varphi(\theta \mid x^{(N)}).$    (2.7)

This type of approach is at most valid only under stringent assumptions.
Many statisticians have questioned its validity entirely. Recently Barnard
(1975) developed a pivotal approach to parametric inference. His
approach, as demonstrated by Hinkley (1975), can easily be adapted to a
predictivistic mode by finding predictive pivots. It appears also to be
capable of incorporating certain types of prior information.


3. INTERMEDIATE STRUCTURE. The classical (Neyman-Pearson) approach
only assumes $F(x^{(N)}, x_{(M)} \mid \theta)$, i.e. a sampling distribution
and enough structure on the distribution so that one can compute,
independent of $\theta$,

    $\Pr[X_{(M)} \in A(X^{(N)})] = p.$

This of course is not a probability statement for $X^{(N)} = x^{(N)}$, as in
the Bayes approach. Here $p$ represents the degree of confidence that
$X_{(M)} \in A(x^{(N)})$, $p$ being a valid probability in the sense of the
long-term frequency of repetitions from the joint set of random variables
$(X^{(N)}, X_{(M)})$. In other words, $p$ is the proportion of times in the
long run that $X_{(M)} \in A(X^{(N)})$ and is interpreted as the confidence
one has in $X_{(M)} \in A(x^{(N)})$ once $X^{(N)} = x^{(N)}$ has been
observed. $A(x^{(N)})$ is usually referred to as a tolerance interval in the
statistical literature. For example, if we are dealing with the problem of
predicting the $(N+1)$st observation $X_{N+1}$ from the first $N$
observations $X_1,\ldots,X_N$ and assume that $\{X_i\}$, $i = 1,\ldots,N+1$,
are iid $N(\theta, 1)$ then one notes that for
$\bar{X}_N = N^{-1} \sum_{i=1}^{N} X_i$,

    $\frac{X_{N+1} - \bar{X}_N}{\sqrt{1 + N^{-1}}} \sim N(0, 1).$    (3.1)

From (3.1) we obtain

    $\Pr\left[a \le \frac{X_{N+1} - \bar{X}_N}{\sqrt{1 + N^{-1}}} \le b\right] = \Pr\left[\bar{X}_N + a\sqrt{1 + N^{-1}} \le X_{N+1} \le \bar{X}_N + b\sqrt{1 + N^{-1}}\right] = \Phi(b) - \Phi(a) = p,$    (3.2)

where $\Phi(y)$ is the standard normal distribution function.

While (3.2) is a probability statement, once we observe
$\bar{X}_N = \bar{x}_N$ and calculate the limits, this now becomes a
confidence statement and has only the restricted interpretation discussed
before.
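The long-run frequency reading of (3.2) can be checked by simulation. The following is a minimal sketch, assuming unit-variance normal data as in the example; the function names `prediction_interval` and `coverage`, the equal-tailed choice $a = -b$, and the simulation settings are illustrative assumptions and not part of the original development.

```python
import math
import random
from statistics import NormalDist


def prediction_interval(xs, p=0.95):
    """Equal-tailed limits for X_{N+1} from (3.2), assuming iid N(theta, 1) data."""
    n = len(xs)
    xbar = sum(xs) / n
    # Choose a = -b so that Phi(b) - Phi(a) = p.
    b = NormalDist().inv_cdf(0.5 + p / 2.0)
    half_width = b * math.sqrt(1.0 + 1.0 / n)
    return xbar - half_width, xbar + half_width


def coverage(theta=3.0, n=10, p=0.95, reps=20000, seed=1):
    """Proportion of repetitions in which the next observation falls in the interval."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        xs = [rng.gauss(theta, 1.0) for _ in range(n)]
        lo, hi = prediction_interval(xs, p)
        hits += lo <= rng.gauss(theta, 1.0) <= hi
    return hits / reps
```

The proportion returned by `coverage` refers to repetitions of the whole pair (sample, future value); it says nothing about any single computed interval, which is the restricted interpretation noted above.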

A point predictor is usually obtained by inserting in
$E(X_{(M)} \mid x^{(N)}, \theta)$ an estimate $\hat{\theta}$ for $\theta$,
the expectation being taken over the conditional sampling distribution.

Another approach, having its roots in Fisher's work (1956), termed
predictive likelihood, has recently been independently introduced by Hinkley
(1975) and Lauritzen (1974). Here as in the fiducial approach, sufficiency,
though in an extended sense, plays the key role. It is assumed that
$(X^{(N)}, X_{(M)})$ have likelihood $L(x^{(N)}, x_{(M)} \mid \theta)$ which
admits a totally sufficient reduction of the data. In the case of
independent and identically distributed random variables a minimal
sufficient reduction need only be available. In this latter case as pointed
out by Fisher (1956), a minimal sufficient statistic is a function of the
individual sufficient statistics from any portion of the entire sample. The
concept of a totally sufficient statistic introduced by Lauritzen (1974)
permits extension of this result to the more general case of dependence.

Let $S_N = s(x^{(N)})$ and $s(x^{(N)}, x_{(M)})$ be the sets of totally
sufficient statistics for $\theta$ based on the random variables to be
observed and those that are to be observed and predicted, respectively. Then
one can obtain, independent of $\theta$, the conditional probability function

    $f(s(x^{(N)}) \mid s(x^{(N)}, x_{(M)}))$    (3.3)

which is now defined as being proportional to the predictive likelihood, i.e.

    $f(s(x^{(N)}) \mid s(x^{(N)}, x_{(M)})) \propto \mathrm{prlk}(x^{(N)} \mid x_{(M)}).$    (3.4)

This is then treated as is the usual $L(x \mid \theta)$ where now $x_{(M)}$
takes on the role of $\theta$. For the fixed value $x^{(N)}$ the predictive
likelihood orders the plausibility for various values $X_{(M)} = x_{(M)}$.
For a simple example, consider $X_i$, $i = 1,\ldots,N+M$, as Bernoulli iid
random variables where $P(X_i = 1) = 1 - P(X_i = 0) = \theta$. If $r$ out of
the first $N$ are 1's, we can order possible predictive values for the
number of 1's, say $t$, in the next $M$ trials. Defining
$R = \sum_{i=1}^{N} X_i$, $T = \sum_{i=1}^{M} X_{N+i}$, which are
sufficient, we can compute in a simple fashion

    $\Pr[R = r \mid R + T = r + t] = \frac{\binom{N}{r}\binom{M}{t}}{\binom{N+M}{r+t}} \propto \mathrm{prlk}(r \mid t)$    (3.5)

which is used to order the plausible values for $t = 0,\ldots,M$.
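The ordering induced by (3.5) is easy to tabulate. The sketch below is only an illustration of the Bernoulli example; the function names `prlk` and `ordered_predictions` are ours, and the values are used as a likelihood in $t$ (they need not sum to one over $t$).

```python
from math import comb


def prlk(r, t, N, M):
    """Pr[R = r | R + T = r + t] as in (3.5); hypergeometric and free of theta."""
    return comb(N, r) * comb(M, t) / comb(N + M, r + t)


def ordered_predictions(r, N, M):
    """Candidate values t = 0..M sorted from most to least plausible."""
    values = [(t, prlk(r, t, N, M)) for t in range(M + 1)]
    return sorted(values, key=lambda pair: pair[1], reverse=True)
```

For instance, `ordered_predictions(3, 10, 5)` ranks the possible numbers of 1's in the next five trials after three 1's in the first ten.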
A point predictor can conceptually be obtained by maximizing the
predictive likelihood. In the case where $M > 1$ and the random variables
are iid, it is clear that $\mathrm{prlk}(x^{(N)} \mid x_{(M)})$ will have
multiple maxima due to the exchangeability of the likelihood. This must be
so and should be no cause for concern. In the previous example though, there
may be a unique maximum at some value of $t$, which is adequate if $t$ is to
be predicted. It is clear, however, that if the individual
$x_{N+1},\ldots,x_{N+M}$ were to be predicted and the maximum was at
$t = t_0$, say, then every partition of $x_{N+1},\ldots,x_{N+M}$ into $t_0$
1's and $M - t_0$ 0's would also yield identical maxima of the prlk.

For a variety of interesting applications of predictive likelihood to
standard statistical situations, the reader is referred to Hinkley (1975).


4. LOW STRUCTURE AND ASSESSMENT. Before actually discussing techniques
available in low structure situations it will be useful to review a very old
and informal method of considerable value in comparing point predictors.

Suppose several predictors are suggested for a set of data; then a fruitful
comparison of them may be accomplished by a validation technique. The sample
$x^{(N)}$ is randomly divided into two parts
$x^{(N-n)} = (x_1,\ldots,x_{N-n})$ and $x_{(n)} = (x_{N-n+1},\ldots,x_N)$
called the construction sample and the validation sample respectively.
Assume also that associated with each sample point $x_j$ is a known value
$z_j$. The data analyst then computes the competing predictors from the
construction sample obtaining, say,

    $\hat{x}_{ji} = \hat{x}_i(x^{(N-n)}, Z^{(N-n)}, z_j)$

as the $i$th predictor for the value $x_j$ at known value $z_j$,
$j = N-n+1,\ldots,N$; $i = 1,\ldots,K$, where $K$ represents the number of
predictors to be compared, and $Z^{(N-n)} = (z_1,\ldots,z_{N-n})$. First the
residuals $\hat{x}_{ji} - x_j = r_{ji}$ are computed and then the empirical
distribution functions of residuals are plotted for each predictor. A
comparison of these empirical distribution functions will shed much light in
determining which predictor is most appropriate. Sometimes when the
validation sample is not very large a relevant summary measure of the
predictive discrepancy is adequate for comparison. For example we might
compute the predictive mean squared error
$n^{-1} \sum_{j=N-n+1}^{N} r_{ji}^2$, $i = 1,\ldots,K$. This procedure is
generally useful only when a reasonably large number of observations is at
hand. This is often not the case. Also the procedure seems inefficient in
that it does not extract all of the information in the data. To overcome
this a technique which is referred to as simple cross-validation may be
substituted.
Let $x_j^{(N-1)} = (x_1,\ldots,x_{j-1},x_{j+1},\ldots,x_N)$ with
corresponding $Z_j^{(N-1)} = (z_1,\ldots,z_{j-1},z_{j+1},\ldots,z_N)$ be the
data set with the $j$th observation omitted. Now for each predictive
function we compute the predictor
$\hat{x}_{ji} = \hat{x}_i(x_j^{(N-1)}, Z_j^{(N-1)}, z_j)$ for the omitted
observation $x_j$ and repeat this for $j = 1,\ldots,N$ for each predictor
obtaining $r_{ji} = \hat{x}_{ji} - x_j$.

Similarly as in the validation set up, we are in a position to compare for
each predictor its empirical distribution function or a relevant summary
measure of predictive discrepancy. However in the case of simple
cross-validation we have $N$ residuals for each predictor instead of $n$ as
in the validation case. One caution is in order: in the validation case the
residuals are dependent only by virtue of the same predictive function
while in the simple cross-validation some further algebraic dependence
creeps in as a result of using the data repetitively. On the other hand
the simple cross-validation assessment uses all of the data while the
validation assessment only uses a sample of the data. Notwithstanding, the
cross-validatory assessment procedure is certainly very useful for the
comparison of predictors generated from various structural assumptions as
the basic dependence is the same for all of them.
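The two assessment schemes may be sketched side by side. This is only an illustration under simplifying assumptions: there are no associated values $z_j$, the two competing predictors (sample mean and sample median) are our own choice, and the names `PREDICTORS`, `validation_pmse` and `cross_validation_pmse` are hypothetical.

```python
import random
from statistics import mean, median

# Two competing point predictors, each computed from the retained data only.
PREDICTORS = {"mean": mean, "median": median}


def validation_pmse(xs, n, seed=0):
    """Hold out a random validation sample of size n; build on the other N - n."""
    rng = random.Random(seed)
    shuffled = xs[:]
    rng.shuffle(shuffled)
    construction, validation = shuffled[:-n], shuffled[-n:]
    out = {}
    for name, fit in PREDICTORS.items():
        pred = fit(construction)
        out[name] = sum((pred - x) ** 2 for x in validation) / n
    return out


def cross_validation_pmse(xs):
    """Omit each observation in turn, giving N residuals per predictor."""
    out = {}
    for name, fit in PREDICTORS.items():
        residuals = [fit(xs[:j] + xs[j + 1:]) - xs[j] for j in range(len(xs))]
        out[name] = sum(r * r for r in residuals) / len(residuals)
    return out
```

With a small sample such as `[1.0, 2.0, 3.0, 4.0, 100.0]` the validation result depends heavily on which points happen to be held out, whereas the cross-validation summary uses every observation once as a target.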


However there are situations where specification of a particular
sampling distribution and the resultant predictor based on such assumptions
may be fraught with peril. When a particular sampling paradigm becomes
difficult or impossible to identify, and yet prediction is necessary, data
analytic techniques based on minimal assumptions need come to the fore. One
such technique, termed predictive sample reuse (PSR), Geisser (1974a, 1975a),
or cross-validatory choice, Stone (1974a), is currently a leading candidate
for a satisfactory resolution of this low structure case. It may also be of
service in what are basically higher structure situations as we will detail
later. First of all the PSR method, when flexibly used, is very likely
to be robust for a variety of sampling paradigms. A second feature is that
it simulates the predictive process upon itself in some optimal fashion often
using some structural hints. It is even capable in one of its manifestations
of comparing a variety of approaches. Essentially the goal is to predict a
future observation or set of such, or some function of them. For the purposes
of this exposition we shall restrict ourselves to a single future observation
with a form arbitrarily chosen for predicting it as

    $\hat{x} = \hat{x}(X, Z, z; \alpha), \qquad \alpha \in \Omega$    (4.1)

where $\alpha$ is some set of unknown values, $X = (x_1,\ldots,x_N)$
represents a sample of size $N$ and with each $x_j$ is associated a known
$z_j$, and $Z = (z_1,\ldots,z_N)$.

It must be stressed that in this approach $\alpha$ is not a platonic ideal
nor in any sense a true value of paramount importance. It is to be regarded
as merely a convenient way of forming a predictive function. Let
$P_i^{(N-n)}$ represent the $i$th partition of the sample into $N-n$
retained and $n$ omitted observations, $0 < n \le M$, where $M$ is the
largest integer such that the predictive function (4.1) can be formed with
$N-M$ observations. More precisely, the observational set $X$ and the set
$Z$ with which it is associated are partitioned such that

    $P_i^{(N-n)} = (X_{ir}^{(N-n)}, Z_{ir}^{(N-n)};\ X_{io}^{(n)}, Z_{io}^{(n)})$    (4.2)

is the $i$th partition belonging to a set $\mathcal{P}$ of partitions
relevant to a particular schema of observational omissions where
$(X_{ir}^{(N-n)}, Z_{ir}^{(N-n)})$ and $(X_{io}^{(n)}, Z_{io}^{(n)})$
represent the $N-n$ retained and $n$ omitted data sets, respectively. Let
the total number of such partitions be $P(N, n, \mathcal{P})$, or simply $P$.
The specified predictive function is then applied to the retained
observations for prediction of the omitted observations for each partition
with the unknown set of values $\alpha$ estimated by means of optimizing an
average discrepancy measure, say,

    $D_{N,n}(\alpha) = P^{-1} \sum_{i=1}^{P} d\left(X_{io}^{(n)},\ \hat{X}_{io}^{(n)}(X_{ir}^{(N-n)}, Z_{ir}^{(N-n)}, Z_{io}^{(n)}; \alpha)\right)$    (4.3)

where each element in the set $\hat{X}_{io}^{(n)}$ is of the form of the
predictive function and $d$ is a measure of the discrepancy of the set of
values $X_{io}^{(n)}$ from the set of predicted values $\hat{X}_{io}^{(n)}$
for given $\alpha$. $D_{N,n}(\alpha)$ is then optimized with respect to
$\alpha$ in some sense. On the basis that this leads to a solution, say
$\hat{\alpha}$, we obtain the predictor
$\hat{x} = \hat{x}(X, Z, z; \hat{\alpha}) = \hat{f}$.


When predictive functions are to be compared irrespective of their
generation one can use a cross-validatory assessment. For a given discrepancy
measure we could consider for the $i$th partition the set of retained
observations $X_{ir}^{(N-n)}$ and associated values $Z_{ir}^{(N-n)}$ and
partition this into two sets
$(X_{irr}^{(N-2n)}, Z_{irr}^{(N-2n)};\ X_{iro}^{(n)}, Z_{iro}^{(n)})$. From
this reduced set of $N-n$ observations and associated values we would, as
previously, obtain an $\hat{\alpha}_i$ and compute the discrepancy (not
necessarily based on the same $d$ as was used to obtain the predictor)
between the values predicted for the $n$ omitted observations and the actual
observations themselves. Repeating this for each $i$ we would then compute
an overall discrepancy measure, the average of these discrepancies over the
$P$ partitions, for each predictive function. This measure then would be
relevant to assessing either different predictive functions or various
estimators of $\alpha$ in terms of predictive discrepancy for the same
predictive functions. We also note that comparisons other than the average
discrepancy can be utilized, e.g., empirical distributions of the
discrepancy can be compared for several predictors. A variety of
applications of PSR can be found in the following papers: Geisser (1974b,
1975a, 1975b), Stone (1974a, 1974b). Here we shall only present one such
very simple application involving a data based predictor which is to be
combined with limited prior information. Let the predictive function be

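    $f = \alpha\, h(x) + (1 - \alpha)\, g, \qquad 0 \le \alpha \le 1$    (4.5)

where $g$ represents a prior guess at the value to be predicted and $h(x)$
the data based predictor. We shall use the squared discrepancy measure,
with a one-at-a-time omission schema so that

    $D_{N,1}(\alpha) = N^{-1} \sum_{j=1}^{N} \left(\alpha h_j + (1 - \alpha) g - x_j\right)^2$    (4.6)

where $h_j$ is of the form $h$, but based on $N-1$ observations, i.e. $x_j$
has been omitted. Minimization of $D_{N,1}(\alpha)$ with respect to $\alpha$
yields

    $\hat{f} = h$                                         if $\hat{\alpha} \ge 1$
    $\hat{f} = g$                                         if $\hat{\alpha} \le 0$    (4.7)
    $\hat{f} = \hat{\alpha} h + (1 - \hat{\alpha}) g$     otherwise

where

    $\hat{\alpha} = \frac{\sum_{j=1}^{N} (h_j - g)(x_j - g)}{\sum_{j=1}^{N} (h_j - g)^2}.$    (4.8)

In particular if $h = \bar{x}$ then for
$s^2 = (N-1)^{-1} \sum_{j=1}^{N} (x_j - \bar{x})^2$ and
$t^2 = N(\bar{x} - g)^2 / s^2$,

    $\hat{\alpha} = \frac{t^2 - 1}{t^2 + (N-1)^{-1}}$     if $t^2 > 1$
    $\hat{\alpha} = 0$                                   otherwise.    (4.9)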
This procedure has the property that if the sample mean is within one
sample standard deviation of the mean, $s/\sqrt{N}$, from the prior guess
$g$ one uses $g$; otherwise one uses the linear combination. Further, as the
distance between the sample mean and $g$ increases relative to the sample
standard deviation, greater weight is attached to the sample mean. Moreover
as $N$ increases the predictor tends asymptotically to the sample mean.
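The agreement between the general solution (4.8) and the closed form (4.9) for $h = \bar{x}$ can be confirmed numerically. The sketch below assumes $h_j$ is the mean of the sample with $x_j$ omitted; the function names are illustrative only.

```python
from statistics import mean, variance


def psr_weight_direct(xs, g):
    """alpha-hat from (4.8), with h_j the mean of the sample omitting x_j."""
    n, total = len(xs), sum(xs)
    h = [(total - x) / (n - 1) for x in xs]
    num = sum((hj - g) * (x - g) for hj, x in zip(h, xs))
    den = sum((hj - g) ** 2 for hj in h)
    return num / den


def psr_weight_closed(xs, g):
    """alpha-hat from (4.9); set to 0 when t^2 <= 1."""
    n = len(xs)
    # statistics.variance uses the divisor N - 1, matching s^2 in the text.
    t2 = n * (mean(xs) - g) ** 2 / variance(xs)
    return (t2 - 1) / (t2 + 1 / (n - 1)) if t2 > 1 else 0.0


def psr_predictor(xs, g):
    """Convex combination of the sample mean and the prior guess g."""
    a = psr_weight_closed(xs, g)
    return a * mean(xs) + (1 - a) * g
```

When the prior guess lies far from the data the weight approaches one, and when $t^2 \le 1$ the prior guess is returned unchanged.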

In many applications it would appear that observational omissions
one-at-a-time are appropriate. However there are some applications where
this may not be the case. This point and others involving various schemata
of omissions and choice of relevant partitions are discussed in Geisser
(1975a).

There have also been various attempts to extend PSR point prediction
to sets, intervals, and regions. It is not yet clear as to how satisfactory
any of these methods are. Pertinent references are Geisser (1974b), Hinkley
(1975), Butler and Rothman (1975).

5. AN APPLICATION. We now illustrate how some of the previous
methodology might be applied in practice to what may be termed a simple
survival situation. Suppose we have a random sample $X_1,\ldots,X_N$ on an
exponential random variable $X$ whose density is

    $f(x \mid \mu) = \mu e^{-\mu x}, \qquad \mu > 0,\ x > 0.$    (5.1)

Further suppose our prior objective or subjective information is subsumed
in a prior density for $\mu$,

    $p(\mu) = \frac{\gamma^{\delta} \mu^{\delta - 1} e^{-\gamma\mu}}{\Gamma(\delta)}, \qquad \gamma > 0,\ \delta > 0.$    (5.2)

Here $\mu$ takes the place of $\theta$ in the high structure Bayesian
approach and $\tau = (\delta, \gamma)$. Our interest is in predicting a
value for the random future observation $X_{N+1}$ given the previous $N$
observations $x^{(N)}$, say. Then the predictive density for $X_{N+1}$ is
easily calculated to be

    $f(x_{N+1} \mid x^{(N)}) = \int f(x_{N+1} \mid \mu)\, p(\mu \mid x^{(N)})\, d\mu = \frac{(N + \delta)(N\bar{x} + \gamma)^{N+\delta}}{(N\bar{x} + \gamma + x_{N+1})^{N+\delta+1}}, \qquad x_{N+1} > 0,$    (5.3)

where $\bar{x}$ is the sample mean and $p(\mu \mid x^{(N)})$ is the
posterior density of $\mu$ given the previous $N$ observations $x^{(N)}$.
Hence our forecast about $X_{N+1}$ involves the hyperparameters $\gamma$ and
$\delta$ which enter the problem via the distribution of the parameter
$\mu$. Before any observations are taken one can also find the predictive
(marginal) density of the generic variable $X$, namely

    $f(x) = \int f(x \mid \mu)\, p(\mu)\, d\mu = \frac{\delta \gamma^{\delta}}{(\gamma + x)^{\delta+1}}, \qquad x > 0.$    (5.4)

Hence it is convenient and more appropriate from the predictive view to
think about these hyperparameters in terms of predicting $X$ before any
observations are taken rather than in how they modulate the assumed prior
distribution of $\mu$. Therefore, prior to the sample, we have

    $E(X) = \gamma/(\delta - 1) = g,$
    $\mathrm{Var}(X) = \delta\gamma^2 / [(\delta - 2)(\delta - 1)^2] = g^2 (1 + \alpha)/(1 - \alpha),$    (5.5)

where $\alpha = (\delta - 1)^{-1}$.

Clearly $\mathrm{Var}(X)$ exists for $0 < \alpha < 1$, and $E(X)$ exists for
$\alpha > 0$ while the distribution exists for all $\alpha \notin [-1, 0]$.
Hence if one could frame his prior opinions about the potentially observable
values of $X$ in terms of its expectation and variance then one can easily
execute the whole predictive process by solving for the appropriate values
$\delta$ and $\gamma$ from (5.5) and substituting them in (5.3).
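The step of solving (5.5) for the hyperparameters and substituting them in (5.3) may be sketched as follows. The function names `hyperparameters` and `predictive_density` are ours, and the sketch assumes a prior mean $g > 0$ and a prior variance $v > g^2$, which is what $0 < \alpha < 1$ requires.

```python
def hyperparameters(g, v):
    """Solve (5.5) for (delta, gamma) given prior mean g and prior variance v."""
    if not (g > 0 and v > g * g):
        raise ValueError("need g > 0 and v > g**2 so that 0 < alpha < 1")
    alpha = (v - g * g) / (v + g * g)  # from v = g^2 (1 + alpha) / (1 - alpha)
    delta = 1.0 + 1.0 / alpha          # alpha = 1 / (delta - 1)
    gamma = g / alpha                  # g = gamma / (delta - 1)
    return delta, gamma


def predictive_density(x_next, xs, delta, gamma):
    """Density (5.3) of the next observation given the sample xs.

    With an empty sample this reduces to the marginal density (5.4).
    Direct powers are used, so very large samples may overflow.
    """
    n, total = len(xs), sum(xs)
    return ((n + delta) * (total + gamma) ** (n + delta)
            / (total + gamma + x_next) ** (n + delta + 1))
```

For example, a prior mean of 2 with prior variance 12 gives $\alpha = 0.5$, hence $\delta = 3$ and $\gamma = 4$.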

It is to be noted that (5.3) and (5.4) were obtained from (5.1)
and (5.2). However, for the predictivist who would prefer to start from
(5.1) and (5.4) in terms of convenience of framing his predictions this
is somewhat awkward. Interestingly enough in this case starting with
$f(x \mid \mu)$ and $f(x)$ is sufficient to obtain $p(\mu)$, which is
a more logical and appealing approach for the predictivist. This is
possible here because $f(x)$ is the unique Laplace transform of
$\mu\, p(\mu)$.

Now as we mentioned previously positing all of these assumptions
yields the requisite information for making probability statements about
a future value provided that one has specified values for $g$ and $\alpha$.
However while one may often be willing to hazard a guess at $g$, one may be
far less willing to specify a value for $\alpha$. So in further analysis of
this problem we may be in a position such that some of the parameters of
$\tau$ are assumed known and others unknown. Assume then that $g$ is known
but not $\alpha$.

One approach for estimating $\alpha$ or $\delta$ is from the marginal density

    $f(x_1,\ldots,x_N \mid \delta, \gamma) = \int f(x_1,\ldots,x_N \mid \mu)\, p(\mu \mid \delta, \gamma)\, d\mu = \frac{\Gamma(N+\delta)\, \gamma^{\delta}}{\Gamma(\delta)\, [N\bar{x} + \gamma]^{N+\delta}}.$    (5.6)

Since we assume $g = \gamma/(\delta - 1)$ known we let $y_i = x_i/g$ and
obtain for $N\bar{y} = \sum_{i=1}^{N} y_i$

    $f(y_1,\ldots,y_N \mid \delta) = \frac{\Gamma(N+\delta)\, (\delta - 1)^{\delta}}{\Gamma(\delta)\, [N\bar{y} + \delta - 1]^{N+\delta}}.$    (5.7)

Clearly $N\bar{y} = S$ is sufficient for $\delta$ in the above likelihood.
The density of $S$ is then easily obtained to be

    $f(s \mid \delta) = \frac{\Gamma(N+\delta)\, (\delta - 1)^{\delta}\, s^{N-1}}{\Gamma(N)\, \Gamma(\delta)\, (s + \delta - 1)^{N+\delta}}$    (5.8)

which implies that $\alpha S \sim \beta_2(N, \delta)$, a Beta distribution
of the second kind. The method of moments essentially fails here to yield a
sensible estimate, e.g. $E(S) = N$, which is uninformative relative to
$\delta$ or $\alpha$. Use of higher moments tends to restrict the range of
$\hat{\delta}$ and renders it unreasonable as an estimator. The reason that
moment estimators are basically inappropriate here is that they assume the
existence of the moments used and hence tend to presume a restriction on the
range of $\delta$, whose restriction at the outset is $\delta > 1$. One can
use however maximum likelihood estimation. Hence we calculate


dlogf 

a  6 


=log- 


5-1 

s+6-1 


1  N+S 

N-1-H6  "  s+8-1 


(5.9) 


and  one  would  have  to  find  by  one  means  or  another  8  satisfying  ^  g  =0. 

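An explicit solution for $\delta$ seems impossible to achieve. One can approximate
(5.9) by using the Euler-Maclaurin sum formula so that we obtain for large N

$$\frac{\partial \log f}{\partial \delta} \doteq \log\frac{\delta-1}{\delta} + \frac{\delta}{\delta-1} + \log\frac{N+\delta}{s+\delta-1} - \frac{N+\delta}{s+\delta-1} + \frac{1}{2\delta} - \frac{1}{2(\delta+N)}. \qquad (5.10)$$

This is still quite formidable and when set equal to zero still does not yield
an explicit solution for $\delta$.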
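
Although no closed form is available, the score equation can be attacked numerically. The following Python sketch is an illustration only and is not part of the original analysis. It codes the log density (5.8), its derivative (5.9), and the Euler-Maclaurin approximation (5.10), and then looks for a root of (5.9) by bisection. The function names, the bracketing interval, and the sample values N = 10 and s = 25 are assumptions chosen for illustration. The search returns None when the score does not change sign on the bracket.

```python
import math


def log_density_s(s, delta, n):
    """Log of the density (5.8) of S given delta (requires delta > 1, s > 0)."""
    return (math.lgamma(n + delta) - math.lgamma(n) - math.lgamma(delta)
            + delta * math.log(delta - 1.0)
            + (n - 1) * math.log(s)
            - (n + delta) * math.log(s + delta - 1.0))


def score_exact(s, delta, n):
    """Derivative of log f(s | delta) with respect to delta, as in (5.9)."""
    # For integer n, digamma(n + delta) - digamma(delta) equals this finite sum.
    harmonic = sum(1.0 / (delta + j) for j in range(n))
    return (math.log((delta - 1.0) / (s + delta - 1.0))
            + delta / (delta - 1.0)
            + harmonic
            - (n + delta) / (s + delta - 1.0))


def score_approx(s, delta, n):
    """Euler-Maclaurin approximation (5.10) to the score."""
    return (math.log((delta - 1.0) / delta)
            + delta / (delta - 1.0)
            + math.log((n + delta) / (s + delta - 1.0))
            - (n + delta) / (s + delta - 1.0)
            + 1.0 / (2.0 * delta)
            - 1.0 / (2.0 * (delta + n)))


def solve_score(s, n, lo=1.0 + 1e-6, hi=1.0e4, tol=1e-10):
    """Bisection for a root of (5.9) on (lo, hi); None if there is no sign change."""
    f_lo = score_exact(s, lo, n)
    f_hi = score_exact(s, hi, n)
    if f_lo * f_hi > 0.0:
        return None
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        f_mid = score_exact(s, mid, n)
        if f_lo * f_mid <= 0.0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Hypothetical values: N = 10 observations with S = N * ybar = 25.
    print(solve_score(25.0, 10))
```

Bisection is used here only because it needs nothing beyond the score itself; any one-dimensional root finder would serve equally well.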


We now show how PSR may be of service even in this high structure
situation. Suppose we were to predict a single value from (5.3)
using the predictive mean

$$E(X_{N+1} \mid \bar{X} = \bar{x}) = (\alpha N\bar{x} + g)/(\alpha N + 1). \qquad (5.11)$$


Apply the PSR method for the estimation of $\alpha$ using (5.11) as a
predictive function and squared discrepancy with one-at-a-time omission
schema so that

$$D(\alpha) = \frac{1}{N}\sum_{j=1}^{N}\left(x_j - \frac{\alpha(N-1)\bar{x}_{(j)} + g}{\alpha(N-1)+1}\right)^{2}, \qquad (5.12)$$

where $\bar{x}_{(j)}$ is the mean of the observations with $x_j$ omitted. Minimization
of $D(\alpha)$ with respect to $\alpha$ yields

$$\hat{\alpha} = \begin{cases} (t^2-1)/N & \text{for } t^2 > 1 \\ 0 & \text{for } t^2 \le 1 \end{cases} \qquad (5.13)$$

where $t^2 = N(g-\bar{x})^2/s^2$ and $(N-1)s^2 = \sum_{i=1}^{N}(x_i-\bar{x})^2$. Hence PSR may be used
to generate estimates even in the high structure case. On the other hand
using (5.11) and (5.12) as a predictive function and discrepancy measure
respectively yields a PSR predictor

$$\hat{X}_{N+1} = (\hat{\alpha} N\bar{x} + g)/(\hat{\alpha} N + 1) \qquad (5.14)$$


that  does  not  strictly  depend  on  high  structure  assumptions.  In  fact  it 
may  be  robust  for  a  variety  of  high  structure  assumptions  which  result  in 
a predictive expectation approximately equal to (5.11). Actually if one
did  not  use  any  high  structure  hint  for  a  predictive  function  for  this 
problem  but  merely  used  a  convex  combination  of  sample  mean  and  prior  guess 

$$\hat{x} = \omega\bar{x} + (1-\omega)g, \qquad 0 \le \omega \le 1, \qquad (5.15)$$

then the result for $\hat{X}_{N+1}$ was already obtained in section 4 as

$$\hat{X}_{N+1} = \begin{cases} \bar{x} - \dfrac{N}{N-1}\,\dfrac{\bar{x}-g}{t^2+(N-1)^{-1}} & \text{if } t^2 > 1 \\ g & \text{if } t^2 \le 1. \end{cases} \qquad (5.16)$$

This may be contrasted with (5.14) when the value for $\hat{\alpha}$ is inserted,
which turns out to be

$$\hat{X}_{N+1} = \begin{cases} \bar{x} - \dfrac{\bar{x}-g}{t^2} & \text{if } t^2 > 1 \\ g & \text{if } t^2 \le 1. \end{cases} \qquad (5.17)$$

The predictor in (5.17) is weighted slightly more towards $\bar{x}$ than
(5.16), but in fact they are asymptotically equivalent to order $N^{-1}$.
In any practical example there would probably not be much to choose
between them.
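
The closed-form PSR estimate can be checked against a direct evaluation of the discrepancy. The Python sketch below is illustrative only: it assumes the one-at-a-time squared discrepancy described for (5.12), takes $s^2$ to be the sample variance with divisor N-1, and uses the estimate $(t^2-1)/N$ for $t^2 > 1$, which is the value that reproduces the predictor (5.17) when substituted into (5.14). The function names and data are hypothetical.

```python
def psr_alpha(x, g):
    """Return (alpha_hat, t2) for observations x and prior guess g."""
    n = len(x)
    xbar = sum(x) / n
    s2 = sum((xi - xbar) ** 2 for xi in x) / (n - 1)  # assumed divisor N - 1
    t2 = n * (g - xbar) ** 2 / s2
    alpha = (t2 - 1.0) / n if t2 > 1.0 else 0.0
    return alpha, t2


def psr_predictor(x, g):
    """Predictor of the form (5.14) with the PSR estimate of alpha inserted."""
    n = len(x)
    xbar = sum(x) / n
    alpha, _ = psr_alpha(x, g)
    return (alpha * n * xbar + g) / (alpha * n + 1.0)


def discrepancy(alpha, x, g):
    """Average squared error when each x_j is predicted from the other N - 1."""
    n = len(x)
    total = sum(x)
    d = 0.0
    for xj in x:
        mean_without = (total - xj) / (n - 1)  # mean with x_j omitted
        pred = (alpha * (n - 1) * mean_without + g) / (alpha * (n - 1) + 1.0)
        d += (xj - pred) ** 2
    return d / n


if __name__ == "__main__":
    data = [2.1, 0.4, 3.3, 1.7, 0.9, 2.8]  # hypothetical sample
    guess = 4.0                            # hypothetical prior guess g
    a_hat, t2 = psr_alpha(data, guess)
    print(a_hat, t2, psr_predictor(data, guess))
    # The discrepancy should not decrease when alpha moves off the estimate.
    print(discrepancy(a_hat, data, guess) <= discrepancy(a_hat + 0.01, data, guess))
```

When $t^2 \le 1$ the estimate is zero and the predictor collapses to the prior guess g, which is the second branch of (5.17).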

It  is  also  to  be  noted  that  the  intermediate  structures  are  difficult 
or  impossible  to  apply  in  situations  such  as  this  one  where  there  may  be 
some  prior  information  that  should  be  taken  into  account. 

6. REMARKS. A somewhat abbreviated exposition of the predictivistic
view  has  been  presented.  This  view  is  not  a  mode  of  inference  as  such  but 
can  be  implemented  from  a  variety  of  inferential  modes.  It  stems  from  the 
attitude  that  inferences  should  be  restricted  to  potentially  observable 
entities unless compelling reasons to the contrary exist. In conformance with
this  view  we  have  presented  various  ways,  arising  from  different  stand¬ 
points,  of  implementing  the  predictive  approach.  In  particular  a  recently 
developed low structure approach, PSR, has also been delineated in somewhat
greater detail; it should be of great value in many situations and
needs to be added, we believe, to the toolkit of every statistician.

VARIOUS METHODOLOGICAL APPROACHES TO PEER EVALUATIONS

Ronald  G.  Downey  and  Paul  Duffy 

U.S. Army Research Institute for the Behavioral and Social Sciences

Arlington,  Virginia 

When  confronted  with  the  prospect  of  drawing  order  out  of  complex 
human  behavior  in  the  equally  complex  world  of  work,  two  primary  charac¬ 
teristics  have  marked  traditional  behavioral  science  research.  First, 
heavy  reliance  has  been  placed  upon  human  evaluations  or  ratings  of 
other  humans.  Secondly,  these  performance  or  trait  ratings  have  been 
predominantly gathered from a limited observational viewpoint, namely
the supervisor. The technique outlined in the present paper does not
deviate from the first of these characteristics; it does rely on human
evaluation of other humans. However, it goes beyond the second characteristic
by gathering such evaluative information from the additional
perspective  of  an  individual’s  peers.  For  purposes  of  the  present 
paper,  peers  are  operationally  defined  by  their  sharing  of  some  common 
purpose  (e.g.,  members  of  the  same  work  group),  and  generally  by  the 
lack  of  a  formally  recognized  authority  relationship  between  them.  The 
term  associate  will  be  used  interchangeably  with  peer. 

The history of peer evaluations can be traced back to post-World
War II work by Williams and Leavitt (1947).¹ The history of the technique
can be traced back even further to the original work of Moreno (1934)
and his development of the sociogram technique. Since that time, peer
evaluations have been used for two primary purposes. The first of
these purposes is evaluative in the criterion sense (i.e., leadership
effectiveness, job performance, etc.). The second purpose is evaluative
in the sense of predicting future behavior or success (i.e., motivation
to work, goal orientation, potential, etc.). Lindzey and Byrne (1968)
have presented an excellent review of the use of social choice methodology,
of which peer evaluations are one type. More specialized reviews
of the work are: Gibb (1961), Gibb (1969), Hollander (1954), Boulger
and Colmen (Note 3), and Nadal (Note 4).

¹Some research efforts were reported before this, during and just after
World War II. See, for example, Clarke (1946), U.S. Army Research
Institute (Note 1), and U.S. Army Research Institute (Note 2), where peer
evaluations were used as criteria for leadership studies.

Aside  from  considerations  about  the  use  of  peer  evaluations, 
another major issue centers on which dimension peers are
evaluating. For instance, previous research has been directed at peer
evaluations  of  leadership  (Hollander,  1965),  personality  traits  (Tupes 
and  Christal,  Note  5),  and  supervisor  skills  (Weitz,  1958)  to  name  but  a 
few  of  the  dimensions  which  have  been  investigated.  While  we  will  not 
directly  address  the  issue  of  which  dimension  is  measured,  it  is 
probably  the  single  most  important  decision  the  researcher  makes  in 
the  design  of  the  experiment. 

Given  this  short  background  we  will  address  two  major  areas  which 
relate  to  the  development  of  a  peer  evaluation  system;  first,  method¬ 
ological  considerations  and  second,  situational  factors  which  could 
impact  upon  the  evaluative  process. 

To  facilitate  understanding  of  the  methodological  issues,  they  will 
be  described  in  terms  of  effects  upon  the  major  scaling  techniques 
available, of which there are four: ratings, rankings, full nominations,
and high nominations. A summary of the following discussion is provided
in Figure 1.

Methodological  Issues 

The  general  paradigm  of  the  rating  technique  calls  for  a  group 
member  to  provide  a  rating  of  the  relative  amount  or  degree  of  the 
dimension  under  consideration  possessed  by  every  other  group  member. 

The  ranking  procedure  simply  requires  each  group  member  to  rank  order 
every  other  group  member  from  high  to  low  (or  some  other  relevant 
continuum) on the dimension under consideration. The full nomination
technique requires that each group member choose a specified number or
proportion of the group as being either high, medium, or low on
the dimension. In the present paper, the minor variation of this
technique in which middle or medium nominations are not required
will also be referred to as full nominations. However, the case where
only  high  nominations  are  elicited  is  reserved  as  a  discriminably  different 
technique  for  reasons  to  be  elaborated  in  later  portions  of  the  paper. 
Several  variations  based  on  combinations  of  these  basic  techniques  are 
forced  distribution  rankings  or  combinations  of  rankings  and  ratings  or 
nominations.  General  scoring  algorithms  for  the  four  primary  techniques 
are  presented  below: 

Ratings

Score = [numerator not legible in the source] / N

Rankings

Score = [formula not legible in the source]

Full Nominations

Score = [numerator not legible in the source] / N

where $r_{Rt}$ = Rating
      $r_{Rk}$ = Ranking
      $r_L$ = Low nomination
      $r_M$ = Mid (or no) nomination
      $r_H$ = High nomination
      $N$ = Number giving an evaluation
      $N_T$ = Total number in the group

By  inspection,  several  characteristics  of  these  formulae  should 
be  noted.  All  of  these  techniques  produce  scores  which  are,  in 
general,  independent  of  group  size  with  the  exception  of  the  rank¬ 
ing  formula  in  which  case  adjustment  must  be  made  for  group  sizes 
greater than 100. It can also be seen that the average score for a
group using either a ranking or nomination technique is predetermined,
whereas the average score for the rating technique is free to vary.

Metric  and  Distribution 

The  metric  and  distributional  properties  of  associate  evalua- 
tions  are  directly  related  to  the  particular  technique  employed. 

With  respect  to  the  scaling  properties  of  the  various  techniques,  the 
rankings  and  both  nominations  from  an  evaluator  are  ordinal  data 
(Stevens, 1951). The ratings from an evaluator are the most nearly
interval data although here also it can be argued that these are
merely  ordinal  data.  The  scaling  properties  of  the  summated  scores 
from  the  various  techniques  approximate  interval  data  as  the  number 
in  the  evaluation  group  increases. 

In addition, the four most common procedures will typically produce
different distributions, examples of which are displayed in Figure 2.
Given the free response mode for ratings, they will often produce
negatively  skewed  distributions  due  largely  to  group  norms  to  inflate 
any  evaluative  procedure.  The  ranking  procedure,  if  it  were  perfect- 
ly  reliable,  would  produce  a  rectangular  distribution  with  one  person 
at  each  rank.  Generally,  less  reliable  rank  scores  will  tend  to  be 
normally  distributed  with  even  less  reliable  scores  producing  a  more 
leptokurtic  curve,  and  a  perfectly  unreliable  test  producing  a  point 
distribution  with  everyone  receiving  an  average  rank  equal  to  the 
middle  rank.  Full  nomination  scores  produce  a  distribution  which, 
if  perfectly  reliable,  is  tri-modal  with  one  group  receiving  all 
high  nominations,  a  group  with  all  low  nominations  and  the  remainder 
having  middle  nominations  or  none  at  all.  High  nominations  only  pro¬ 
duce  a  bi-modal  distribution  (not  shown  in  Figure  2). 

Basis  of  Comparison 

Scores  which  result  from  the  four  primary  techniques  vary  along 
another  important  dimension;  that  is,  the  internal  process  evoked 
in the evaluator upon which he makes his judgement. In one case, the
evaluator compares the particular individual against some external
(to the group) frame of reference and assigns him to some category.
In the second case, the evaluator compares the particular individual
against some internal (to the group) frame of reference,
makes a judgement of more or less, and assigns him to the appropriate
category. The external process can only be used with the rating
procedure. The internal process can be used with the ratings, but
it must be used with rankings and nominations. It should be noted
that the internal process, in general, requires a moderate number of
individuals in the group (more than 5). The direct implication of
this  distinction  is  that  the  external  frame  of  reference  allows  both 
comparison  between  individuals  across  peer  groups  and  the  comparison 
of  peer  groups.  The  internal  process  does  not  allow  comparison 
between  individuals  across  peer  groups  unless  the  assumption  is 
accepted  that  the  groups  are  equal  on  the  particular  ability,  trait 
or  behavior. 

The  corollary  of  this  implication  is  that  population  norms 
can  be  developed  only  through  the  use  of  a  rating  procedure  and  an 
external  frame  of  reference. 

Reliability 

The  reliability  of  associate  evaluations  has  generally  been 
determined  by  one  of  two  methods,  internal  consistency  or  test-retest. 
Both  methods  are  analogous  to  the  same  procedures  in  classical  test 
theory  (Lord  and  Novick,  1968). 

The internal consistency of peer evaluations is the degree to
which members of a peer group agree with one another when observing
an individual in a similar situation and at the same time. Using the
multiple choice test paradigm, the evaluators are comparable to the
test items and those who are being evaluated are comparable to the
people taking the test. While Gordon (1969) has recommended the use
of the alpha coefficient for estimating the internal consistency or
reliability of peer evaluations, the most common procedure has been
a split-half (or group) estimate. The split-half estimate is made
by computing scores for all group members, randomly assigning peer
group members to one of two groups, and then correlating the scores for
each ratee from each group (See Hollander, 1957; and Downey, Note 6).
The correlation is then adjusted for the total group size using the
Spearman-Brown formula (Gulliksen, 1950). If small groups are used,
a random split may not be possible and some technique for averaging
the intercorrelations between evaluators could be used (Gulliksen,
1950).
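
The split-half procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure used in the studies cited: the data layout (a square table `ratings[evaluator][ratee]` whose diagonal is ignored), the function names, and the fixed random seed are all hypothetical, and the Spearman-Brown step uses the usual doubling factor for two halves of equal size.

```python
import random
from statistics import mean


def pearson(u, v):
    """Product-moment correlation between two equal-length score lists."""
    mu, mv = mean(u), mean(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    var_u = sum((a - mu) ** 2 for a in u)
    var_v = sum((b - mv) ** 2 for b in v)
    return cov / (var_u * var_v) ** 0.5


def spearman_brown(r_half, factor=2.0):
    """Step a part-group correlation up to a group `factor` times as large."""
    return factor * r_half / (1.0 + (factor - 1.0) * r_half)


def split_half_reliability(ratings, seed=0):
    """Randomly split the evaluators in two, score every ratee from each
    half, correlate the two sets of scores, and apply Spearman-Brown."""
    n = len(ratings)
    if n < 4:
        raise ValueError("need at least 4 group members for a random split")
    evaluators = list(range(n))
    random.Random(seed).shuffle(evaluators)
    half_a, half_b = evaluators[: n // 2], evaluators[n // 2:]

    def half_scores(half):
        # Mean rating received by each ratee, excluding any self-evaluation.
        return [mean(ratings[e][r] for e in half if e != r) for r in range(n)]

    return spearman_brown(pearson(half_scores(half_a), half_scores(half_b)))


if __name__ == "__main__":
    # Hypothetical 6-person group; None marks the unused self-rating cell.
    table = [
        [None, 4, 2, 5, 3, 1],
        [5, None, 2, 4, 3, 2],
        [4, 4, None, 5, 2, 1],
        [5, 3, 1, None, 3, 2],
        [4, 4, 2, 5, None, 1],
        [5, 3, 2, 4, 3, None],
    ]
    print(split_half_reliability(table))
```

Because the estimate depends on which evaluators fall in each half, repeating the split with several seeds and averaging the results is one simple way to steady it in small groups.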

The test-retest method of estimating reliability requires that
group members evaluate each other at two different times. Scores
from the two different evaluations are then correlated. Examples of
this type of estimate are given in Hollander (1957), Downey (Note 6),
and Downey (Note 7). Perhaps the most rigorous examination of reliability
was done by Gordon and Medland (1965), where they varied both
time and group doing the evaluations and found reliabilities in the
.80's.

Research has generally found the reliability of peer evaluations
to be in the .70 to .90 range, regardless of the reliability estimate
employed. Research which has compared the various evaluative methodologies
is rare, but, in general, has supported the view that all four
methods are quite similar with maybe a slight advantage to ratings
(See Suci, Vallance, and Glickman, Note 8; Downey, Note 6; and Hammer,
Note 9). Even the use of a paired comparison procedure does not
significantly improve reliability (Bolton, Note 10). The selection of
a particular technique will rarely be decided by differences in
reliability between the techniques.

Acceptability 

A  major  factor  in  the  success  or  failure  of  a  particular  research 
program  is  the  degree  of  involvement  and  commitment  to  the  program 
on  the  part  of  the  participants,  in  other  words,  acceptability. 
Acceptability  is  generally  studied  as  a  specific  issue  of  the  particu¬ 
lar  program  under  investigation  rather  than  comparative  analyses  of 
acceptability  across  techniques  or  situations.  There  is,  therefore, 
little  formal  evidence  of  differences  between  techniques  but  many 
inferences  can  be  drawn  based  upon  the  particular  qualities  of  the 
technique.  A  major  factor  in  the  acceptability  of  a  technique  is  the 
degree  of  perceived  difficulty.  From  this  point  of  view,  both  the 
rating and ranking of large numbers of people (greater than 20) can
be time consuming and make for difficult discriminations among the
average members of the group. On the other hand, the nomination
procedure allows the individual to place a large number of people
in a desired category and does not force him to make difficult
discriminations.

The  rating  procedure  is  quite  acceptable  where  the  group  is  small 
and  cohesive.  The  full  nomination  technique  is  acceptable  for  moder¬ 
ate  to  large  size  groups  where  not  all  individuals  are  well  known  to 
one  another.  The  high  nomination  technique  is  even  more  acceptable 
because  it  does  not  require  an  individual  to  make  negative  evaluations. 

A  major  determinant  of  the  degree  of  acceptability  is  the  degree 
to which group members are knowledgeable about the evaluation procedure,
process,  background  and  use.  Downey  (Note  11)  found  that  accept¬ 
ability  improved  as  a  function  of  an  educational  program.  Two  differ¬ 
ent  types  of  attitudes  were  found;  first,  the  degree  to  which  peer 
evaluations  were  felt  to  be  valuable  and  accurate  estimates  and, 
second,  the  degree  to  which  they  were  acceptable  for  particular  uses. 
Downey  also  found  that  the  peer  evaluations  and  acceptance  were 
positively  related,  with  larger  relationships  being  found  in  the  group 
with  less  information  on  the  peer  evaluation  process. 

Feasibility 

Closely  linked  with  the  previous  concept  of  acceptability  is 
feasibility,  or  costs  associated  with  the  implementation  and  execution 
of  a  particular  peer  evaluation  system.  The  major  costs  associated 
with  a  peer  evaluation  system  are:  (1)  preparation  of  evaluation 
materials, (2) administration time, and (3) scoring cost. Prior to
the advent of automatic data processing procedures, the costs associated
with any peer evaluation system with large groups or on a large
scale were prohibitive. Merely in terms of bits of information
collected, it can be seen that the number of evaluations is equal to
$N(N-1)$, where N is the number in the group. Figure 3 presents the comparative
costs associated with each of the four techniques.

As can be seen from Figure 3, each of the four techniques incurs
equally  high  costs  associated  with  the  preparation  of  a  list  of  the 
peers.  It  is  important  that  all  evaluators  be  provided  with  a  full 
list  of  all  other  members  of  the  peer  group.  The  administration  time 
for  the  full  nomination  technique  is  low  due  to  the  small  number  of 
decisions  associated  with  making  the  low  and/or  high  choices.  An 
excessive  amount  of  time  is  devoted  to  making  fine  discriminations 
in  the  ranking  procedure,  whereas  a  moderate  amount  of  time  is  taken 
up  by  the  rating  of  every  individual. 

The  scoring  of  the  peer  evaluations  normally  requires  access  to 
some  sort  of  automatic  data  processing  facility  in  all  but  the  small¬ 
est  scale  operations.  The  actual  computer  cost  is  virtually  equal 
for  all  techniques,  but  they  can  differ  substantially  in  terms  of  the 
costs  associated  with  getting  the  evaluations  into  a  data  processable 
form.  Costs  vary  by  technique  as  a  function  of  using  either  keypunch¬ 
ing  or  optical  scanning.  Both  the  full  and  high  nomination  techniques 
involve  low  cost  and  ratings  also  have  low  costs  associated  with 
optical scanning. Rankings produce high costs in both keypunching
and optical scanning, and ratings have high costs associated with
keypunching. Generally, nominations produce the lowest costs overall,
followed by ratings, with rankings having the highest costs overall.

It  should  be  noted  that  peer  evaluation  systems  are  relatively  costly 
efforts  which  typically  require  more  than  minimal  sophistication 
with  data  processing  procedures. 

Situational  Factors 

In  addition  to  the  methodological  concerns  of  the  various  techni¬ 
ques  presented  in  the  previous  section,  there  are  also  a  variety  of 
situational or contextual factors which can impact upon a peer evaluation
system, often regardless of the specific technique under discussion.
Among these factors are group size, informal group structures,
demographic  characteristics,  group  boundaries,  hierarchical  character¬ 
istics,  friendships,  length  of  association  and  type  of  interaction. 
Each  of  these  factors  will  be  discussed  in  turn  and,  where  appropriate, 
specific  mention  will  be  made  of  their  effect  upon  the  various 
techniques. 

Size 

Very  few  attempts  have  been  made  to  study  the  independent  effects 
of  group  size.  More  often  than  not,  what  evidence  there  is  for  the 
effects  of  group  size  has  been  reported  as  a  byproduct  in  studies 
directed  at  some  other  purpose.  For  example,  Downey,  Medland,  and 
Yates  (Note  12),  used  a  peer  nomination  technique  with  groups  of 
Army Colonels in 14 career groups which varied in size from 22 to 321.
Reliabilities varied from .63 to .94 and the rank order correlation
between group size and reliability was .03. Downey (Note 7), in
a sample of Army Rangers, compared peer ratings collected within
squads (n = 10) with peer nominations collected on the same men
within platoons (n = 40). Correlations between the two scores were
in the .60's. However, there were indications that the platoon
scores  were  both  more  reliable  and  more  predictive  of  job  performance. 

As  mentioned  previously,  from  the  standpoint  of  feasibility, 
both  ratings  and  rankings  would  seem  to  be  most  appropriate  for 
relatively  small  group  sizes  (i.e.,  approximately  a  dozen),  while 
the  nomination  technique  is  virtually  mandatory  for  large  group 
sizes  (i.e.,  greater  than  50).  From  the  standpoint  of  empirical 
results, it appears that small groups may produce unreliable scores
with reduced validity. Alternatively, while it is rational to believe
that  there  is  an  optimal  upper  size  peer  group,  there  is  scant 
evidence  to  support  this  view. 

Informal  Group  Structures 

Given  the  well  documented  fact  that  within  any  formally  defined 
group  there  may  exist  one  or  more  informal  subgroups  defined  by  some 
sort  of  mutual  self  interest,  the  issue  arises  as  to  what  effect  these 
informal  subgroups  may  have  on  a  peer  evaluation  procedure  conducted 
in  the  total  group  for  a  purpose  other  than  finding  what  subgroup 
structure  exists. 

For example, the worst case would be one in which two equal-sized
informal subgroups existed within a total group and included
each group member exclusively in one or the other. In such a situation,
it can be assumed that one or both subgroups might make their
evaluations solely on the basis of subgroup membership, i.e., on a
basis other than the one intended. The net effect of such behavior
is to attenuate the validity of the peer evaluation procedure, and
it is most pronounced when both subgroups engage in such behavior.

The  effect  diminishes  if  one  of  the  groups  does,  in  fact,  provide 
evaluations  on  the  dimension  intended.  The  effect  also  diminishes 
as  informal  subgroup  size  decreases  or  as  the  number  of  subgroups 
increases. 

In  terms  of  technique,  the  effect  of  subgroup  behavior  will  be 
pronounced if ratings or rankings are used with resultant scores
most likely to be negatively skewed. The use of full nominations
will tend to produce scores with decreased variance, and high nominations
will produce the worst case with a drastic reduction in variance.

It is clear that subgroups of sufficient size can have an effect
upon  the  final  scores,  and  therefore  the  question  is  the  incidence 
of  such  effects  and  whether  there  exists  a  mechanism  for  detecting 
its  occurrence.  The  simplest  procedure  for  checking  for  these  problems 
is  the  repetitive  production  of  reliability  indices  as  part  of  the 
procedure  for  producing  peer  scores.  If  the  reliability  coefficients 
were to drop below .60, it would seem to indicate a problem and care
should be taken in use of the evaluations. If the evaluation process
is part of an ongoing process, then the use of a two-way analysis
of variance design with one factor being the types of raters and
the other factor being the same types of ratees should be used.

If  a  significant  interaction  were  found,  then  a  strong  case  could 
be  made  for  peer  scores  being  at  least  partially  the  result  of  group 
membership. 
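
The interaction check suggested above can be illustrated with a short Python sketch. It is a simplified illustration rather than a full analysis: it assumes a balanced fixed-effects layout in which `cells[i][j]` holds the scores given by raters of type i to ratees of type j, it treats those scores as independent observations (which peer evaluations only approximate, since each rater evaluates several ratees), and it returns only the interaction F ratio and its degrees of freedom, to be compared with a tabled critical value. The function name and data are hypothetical.

```python
from statistics import mean


def interaction_f(cells):
    """F ratio for the rater-type by ratee-type interaction in a balanced
    two-way layout; cells[i][j] is a list of scores with equal lengths."""
    a, b = len(cells), len(cells[0])
    n = len(cells[0][0])  # observations per cell (balanced design assumed)
    cell_means = [[mean(c) for c in row] for row in cells]
    grand = mean(m for row in cell_means for m in row)
    row_means = [mean(row) for row in cell_means]
    col_means = [mean(cell_means[i][j] for i in range(a)) for j in range(b)]

    # Interaction sum of squares: departure of cell means from additivity.
    ss_inter = n * sum(
        (cell_means[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(a) for j in range(b)
    )
    # Error sum of squares: spread of scores within each cell.
    ss_error = sum(
        (y - cell_means[i][j]) ** 2
        for i in range(a) for j in range(b) for y in cells[i][j]
    )
    df_inter = (a - 1) * (b - 1)
    df_error = a * b * (n - 1)
    return (ss_inter / df_inter) / (ss_error / df_error), df_inter, df_error


if __name__ == "__main__":
    # Hypothetical scores: each subgroup favors its own members.
    favoring = [
        [[5.0, 7.0], [1.0, 3.0]],  # raters of type 0 -> ratees of type 0, 1
        [[1.0, 3.0], [5.0, 7.0]],  # raters of type 1 -> ratees of type 0, 1
    ]
    print(interaction_f(favoring))
```

A large F ratio in such a table corresponds to the pattern described in the text, where scores depend on whether rater and ratee belong to the same subgroup.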

Demographic  Characteristics 

The  use  of  peer  evaluations  with  their  reliance  upon  fallible 
human  observers  immediately  raises  the  possibility  of  racial  and  sexual 
bias  on  the  part  of  evaluators.  This  concern  is  especially  crucial 
in  view  of  recent  problems  associated  with  demonstrating  the  absence 
of  bias  in  employment  selection  and  classification  measures  as  well 
as  criterion  measures. 

The  evidence  concerning  racial  bias  in  peer  evaluations  is  mixed 
and  inconclusive.  In  a  study  dealing  with  Air  Force  recruits,  Cox 
and  Krumboltz  (1958)  found  that  subjects  were  rated  higher  by  members 
of  their  own  race,  but  the  effect  varied  across  groups  and  there 
was substantial agreement on rank order across races (r = .76).

They  conclude  that  the  bias  which  might  exist  is  far  from  complete 
and  suggest  that  prior  acquaintanceship  of  group  members  may  account 
for  the  differences.  In  a  similar  study  in  the  Army,  deJung  and 
Kaplan  (1962)  found  similar  results  with  ratings  differing  as  a 
function  of  the  rater’s  race.  However,  an  analysis  of  covariance 
adjusting for a combined interest and math score showed that whites
did not give higher adjusted scores to whites or blacks, but blacks
did give higher adjusted scores to blacks. Results were interpreted
in  terms  of  assignment  of  higher  scores  to  close  acquaintances  which 
had  more  of  an  impact  upon  blacks  rating  blacks  due  to  the  smaller 
group  size. 

In  a  more  recent  study  in  an  industrial  training  context,  Schmidt 
and  Johnson  (1971)  used  a  forced  choice  rating  distribution  in  groups 
with  approximately  equal  numbers  of  blacks  and  whites.  No  differences 
due  to  race  were  found. 

The  evidence  suggests  that  peer  evaluations  can  be  subject  to 
racial  bias,  but  the  effect  is  perhaps  more  strongly  related  to  the 
interaction  between  friendship  or  acquaintanceship  and  the  particular 
evaluation  method  used.  The  presence  of  substantial  correlations 
between the rank orderings from each race indicates that a similar
view  prevailed.  But,  the  use  of  ratings  allows  evaluators  to  assign 
unrelated  scores  to  individuals  whom  they  consider  special  in  some 
way. 

In  terms  of  sexual  bias,  Mohr  and  Downey  (Note  13)  recently 
reported  results  from  a  small  sample  of  Army  officers  which  indicated 
that  females  scored  lower  than  males  on  scores  received  from  both 
males  and  females.  If  bias  occurred,  it  was  on  the  part  of  both 
groups.  An  interesting  finding  was  that  females’  self-ratings  were 
not related to either male or female evaluations but males’ self-
ratings were related to these evaluations.

This admittedly small number of studies appears to indicate that
differences based upon race and sex can occur, but it is unclear whether
these differences are attributable to race or sex group differences,
to interaction patterns (i.e., friendships, etc.), to the specific
methodology,  or  some  combination  of  all  of  these  factors.  It  would 
certainly  be  safe  to  say  that  researchers  should  be  sensitive  to  the 
potential  for  such  bias. 

Group  Boundaries 

The  discussion  of  peer  evaluations  has  proceeded  to  this  point 
as  if  it  were  clear  just  what  is  meant  by  a  peer  or  associate  group. 

Most  researchers  report  their  procedures  in  sufficient  detail  to  show 
the  general  characteristics  of  the  groups  which  were,  in  fact,  used. 
However,  given  that  there  are  a  variety  of  overlapping  and  higher 
order  groups  in  most  real-life  settings,  the  issue  becomes  that  of 
defining some basic guidelines for selecting the appropriate rating
group. It is clear that the selection of the evaluative group can be
affected by such factors as the length and type of interaction,
formal  organizational  structure,  informal  group  structure,  friendship 
patterns  and,  of  course,  the  particular  dimension  being  evaluated. 

As  has  been  the  case  for  several  of  the  preceding  issues,  there 
is  little  empirical  data  to  guide  the  selection  of  the  group.  Rather, 
guidelines  must  be  best  guesses  based  on  partial  information  from 
related  data. 

In the previously mentioned study by Downey (Note 7), it was
found that platoon evaluations produced more reliable and slightly
more valid scores than squad evaluations, but the differences were
potentially confounded by differences between both method and size.

A study by Gordon and Medland (1965), in which individuals were
evaluated  at  two  different  times  by  totally  different  groups  of 
different  structure,  indicated  a  high  degree  of  stability  across  the 
two  evaluations.  Even  the  method  which  was  used  to  compute  reliability 
indices,  random  splits  of  the  primary  group,  supports  the  notion  that 
group  composition  can  be  drastically  altered  without  major  problems 
arising  in  producing  reliable  and  valid  scores. 

A concept related to that of group boundaries is that of hierarchies.
For example, an Army platoon is made up of 4 squads, each headed
by a squad leader. If the platoon is chosen as the peer group, the
issue is whether the squad leaders should be included in the process.
Folklore  holds  that  the  inclusion  of  such  individuals  will  often  work 
to  their  disadvantage,  and  therefore  they  should  be  excluded  from  the 
platoon  peer  group  and  included  in  a  peer  group  of  squad  leaders. 

Research  by  Levi,  Torrance,  and  Pletts  (1958)  indicated  no  effects 
from  including  the  formal  leader  in  the  peer  evaluation  process. 

Research by Downey (Note 14), in which the leaders of small combat
units were included in the peer nomination process, indicated that
the  leaders  spanned  the  full  range  of  leadership  potential  scores. 

Rather than the leaders being penalized, there was a positive relationship
between formal position and peer evaluation scores (as there should
be if the selection procedure for leaders had any validity originally).

It  should  be  pointed  out  that  these  data  were  experimental  and 
the introduction of an operational system may change the situation
depending  upon  the  use  to  which  the  resulting  evaluations  will  be  put. 

A  rational  solution  to  the  problem  should  be  guided  by  the 
following  suggestions: 

(1)  Select  the  group  to  have  sufficient  size  to  overcome  problems 
associated  with  primary  groups. 

(2)  Group  size  should  not  be  so  large  as  to  produce  subgroups 
which  may  be  relatively  unknown  to  each  other  or  be  competing  for 
similar resources and rewards.

(3) Groups selected should be somehow reasonably related to the
dimension to be evaluated, e.g., if evaluation of leadership in a work
setting is desired, select a work group and not a social group.

Friendship

Friendship  has  been  a  major  research  issue  in  the  history  of 
peer  evaluations.  This  is  another  case  where  folklore  has  stated 
that  peer  evaluations  are  the  product  of  friendship  or  popularity  and 
are therefore not valid indications of the dimension under consideration.
The impact of this bit of folklore has been that, with the
exception  of  simple  validity  studies,  this  is  probably  the  single 
most  researched  question  associated  with  peer  evaluations. 

Wherry  and  Fryer  (1949)  were  the  first  to  address  the  issue. 

They reported that although there was a moderate degree of relationship
between friendship and a leadership criterion, the major portion
of  the  predicted  criterion  variance  was  independent  of  friendship. 

They  concluded  that  peer  evaluations  of  leadership  are  not  popularity 
contests.  Studies  by  Gibb  (1950)  and  Horrocks  and  Wear  (1953)  in 
college  samples  support  Wherry  and  Fryer’s  findings.  Borgatta  (1954) 
also  reported  that  leadership  and  popularity  evaluations  were  related, 
but  he  failed  to  draw  any  conclusions.  Several  other  studies  have 
documented  a  moderate  degree  of  relationship  between  friendship  and 
peer evaluations of leadership (Hollander, 1956; Hollander and Webb,
1955; Theodorson, 1957).

Downey  (Note  6)  recently  presented  evidence  that  the  use  of  full 
nominations  (with  small  numbers  of  high  and  low  nominations  required) 
reduced  the  correlation  between  friendship  and  leadership  evaluations 
compared  with  forced  distribution  ratings. 

It  would  seem  that  when  an  evaluator  is  faced  with  a  choice  of 
how  to  evaluate  a  friend,  he  will  tend  to  select  a  friend  rather  than 
another  person  he  considers  of  equal,  or  at  least  indistinguishable, 
merit.  Therefore,  the  variance  associated  with  friendship  may  be  a 
source  of  systematic  error  primarily  in  the  middle  of  the  distribution. 
This  systematic  error  variance  will  increase  in  large  groups  where 
some  members  are  relatively  unknown  to  each  other  or  the  interaction 
patterns  are  not  fully  established  for  all  members. 

Even in the face of the impressive research findings demonstrating
the invalidity of the "popularity contest" issue, this remains the
most consistent argument against the use of peer evaluations in an
operational setting. A corollary of this objection is the feeling
that peer evaluators do not make the right choice, the best counterargument
to which is the impressive list of validity studies on peer
evaluations.

Length  of  Association 

When  peer  evaluations  are  considered  for  use  in  any  situation, 
an  important  question  concerns  how  long  group  members  must  have  been 
in  contact  with  each  other  before  reliable  and  valid  evaluations  can 
be  provided.  For  example,  this  issue  is  often  raised  in  the  context 
of transient training groups. Research is fairly consistent in finding
that peers can make reliable and valid evaluations after a relatively
short period of time (typically 3 to 6 weeks).

Subsidiary to the overall issue is the question of the effect of
including  a  new  group  member  in  an  intact  group.  Mayfield  (Note  14) 
has  suggested  that  in  such  a  situation  there  may  be  reason  to  suspect 
that  a  longer  period  of  acquaintanceship  is  necessary  for  sufficient 
integration  into  the  group  to  occur.  A  more  generalized  way  of 
approaching  the  question  is  the  extent  to  which  a  person  is  known  or 
not known to other members of the group. Evidence has shown that
an  individual  not  well  known  to  other  members  of  the  group  will 
typically  be  evaluated  as  lying  near  the  middle  of  the  distribution 
within  the  group  (Downey,  Note  6). 

In  terms  of  technique,  a  nomination  procedure  is  most  likely  to 
decrease the error variance associated with acquaintanceship while
ratings  or  rankings  would  tend  to  capitalize  on  the  error  variance 
and  show  a  greater  degree  of  relationship  with  such  measures. 

Type  of  Interaction 

While  the  use  of  peer  evaluations  has  been  extensive  over  a  span 
of  more  than  twenty-five  years,  they  have  nevertheless  been  applied  in 
rather  limited  situations.  In  fact,  the  majority  of  the  research  has 
been conducted with junior personnel in a military training context.
Recent work outside the military by Weitz (1958) and subsequent follow-ups
by Mayfield (1970; Note 15) has been conducted in industry with
insurance salesmen. There has also been a recent effort to use a
peer nomination process in a senior Army officer promotion system
which produced supportive results (Downey, Medland, and Yates, Note 12).
But, until more extensive research is conducted in broader organizational
contexts with a wider selection of subject populations, the
generality of the peer evaluation process is largely a matter of
conjecture.

A related issue is the type of interaction required to produce
valid evaluation. Freeberg (1969) reported a study in which peer
evaluations were more highly related to a performance criterion when
the interaction between peers was relevant to the dimension being
evaluated. Bayroff and Machlin (Note 16) found that leadership
evaluations could be made in an academic environment and were highly
related to evaluations done after exposure to a situation where
leadership was displayed. Lewin, Dubno, and Akula (1971) indicated
that video tapes supplied sufficient information for reliable evaluations
and were highly related to evaluations from group members.

It  would  be  safe  to  assume  that  peer  evaluations  of  a  variety 
of  complex  human  behaviors  can  be  rendered  reliably  after  exposure  of 
the  peers  to  each  other  in  situations  which  require  the  individual  to 
interact  either  with  the  environment  or  with  other  people  in  work 
oriented  or  socially  oriented  situations.  Further,  it  can  be  surmised 
that  the  validity  of  the  evaluations  will  be  a  function  of  the  degree 
to  which  the  particular  behaviors  are  relevant  to  the  dimension  under 
study.  Hollander  (1956)  found  that  reliable  evaluations  were  given 
after  one  hour  of  discussion  between  peers  in  a  Naval  OCS  class,  but 
that  they  had  only  a  moderate  degree  of  relationship  with  evaluations 
after  3  weeks,  and  were  not  as  predictive  of  eventual  job  performance. 
This  convergence  of  views  by  peers  after  a  short  period  of  exposure 
is  probably  a  function  of  similar  psychological  maps  of  behavior  on 
the part of peers, and the preliminary evaluations on limited information
are subject to revision based upon further information. There
would seem to be little advantage of one evaluative technique over
another as long as the technique does not require the evaluator to
make finer discriminations than are possible based on the type of
interaction.


Summary 

The  peer  evaluation  technique  has  been  used  by  researchers  both 
as  a  criterion  of  complex  human  behavior  and  as  an  index  of  future 
potential.  In  either  case,  the  particular  dimension  measured  has 
varied  considerably.  The  present  paper  reviewed  the  psychometric 
properties  and  related  research  findings  of  the  four  major  techniques 
(ratings,  rankings,  full  nominations  and  high  nominations).  Several 
important  similarities  and  differences  were  indicated.  For  example, 
only  ratings  can  produce  comparable  scores  across  different  groups 
without extensive assumptions. In addition, results of research indicate
little difference in measurement reliability between techniques. The
limited findings also indicate that, in general, ratings and rankings
are less acceptable and less feasible than either of the nomination
techniques.

Furthermore,  a  review  of  both  the  documented  and  likely  effects 
of  various  situational  factors  on  the  evaluation  process  indicated 
the  potential  for  major  problems  unless  the  researcher  is  aware  of  the 
issues.  While  no  direct  relationship  was  found  between  group  size  and 
reliability  or  validity  of  the  evaluations,  it  can  be  assumed  that  very 
small  or  large  groups  will  produce  less  reliable  and  less  valid  scores. 
Group  structure  and  individual  differences  were  found  to  be  sources  of 
potential  problems  which  must  be  monitored  and  dealt  with  by  the 
researcher.  The  popular  issues  of  friendship,  acquaintanceship  and 
type  of  interaction  were  reviewed,  and  there  is  little  evidence  that 
they have a major impact on the validity of the scores. Indications
are  that  all  techniques  are  relatively  impervious  to  a  variety  of 
situational  factors  with  the  nomination  technique  being  perhaps  the 
most  versatile. 

In brief, it has been shown that peer evaluations have been a
fruitful tool in both research and application. Several issues regarding
their use remain to be resolved, but there is sufficient evidence
to  suggest  that  these  issues  are  soluble  and  do  not  detract  from  the 
conclusion  that  peer  evaluations  are  a  very  powerful  tool  for 
discriminating  complex  human  behavior. 

OBJECTIVE ANALYSIS OF CAMOUFLAGE VIA
IMAGE INTERPRETERS

RONALD  L.  JOHNSON 

US  Army  Mobility  Equipment  Research  and  Development  Center 
Fort Belvoir, Virginia 22060

ABSTRACT. In the past, the assessment of camouflage effectiveness has been
difficult to quantify objectively because of its subjective nature. To accomplish
this, 63 image interpreters analyzed imagery of a missile site. Subjects
reported which visual cues enabled site detection and identification. There
were 63 detections and 59 identifications with 13 visual cue categories for
detection and 12 for identification. The frequency of response per category
ranged from 41 to 1 for detection and 42 to 1 for identification. These
frequencies were analyzed by the statistical technique "Minimum Contrasts"
at significance levels of .05 and .01. This procedure objectively determined
which target items were well camouflaged and which needed improvement.

I. INTRODUCTION.

The camouflage of military installations is becoming increasingly critical
as both ground and aerial surveillance techniques improve. The goal of the
camouflage is to increase the survivability of the installations and,
simultaneously, to be cost effective. There is always the need for a reliable
measure  of  the  military  worth  of  camouflage.  This  cannot  be  estimated, 
however,  without  quantifying  the  effects  of  the  applied  camouflage.  In  the 
past,  this  has  been  extremely  difficult  due  to  its  inherent  subjectivity. 

The  purpose  of  this  paper  is  to  demonstrate  a  method  for  that  quantification 
using  the  statistical  technique  "minimum  contrast"  to  obtain  an  item  analysis 
of  the  subjective  cues  identified  by  operational  image  interpreters. 

II.  DESIGN  OF  EXPERIMENT. 

The SAM site selected for experimentation was situated in a German
agricultural area. Three levels of camouflage were applied. The first was
uncamouflaged. The second consisted of tone-down painting of all roads and
buildings, plus construction of an adjacent decoy site. The third level was
camouflaged by simulating the surrounding agricultural fields and trees.
This was accomplished by using camouflage nets, directional plowing,
grass herbicide, and supplementary planting of shrubs and trees. The decoy
site in the second level was removed. Each of the three levels was photographed
with 60% forward overlap using the following 5" format Kodak film:

Black and White Plus X    Kodak #2402
Normal Color              Kodak #2448
Color Infrared            Kodak #2443

The resulting imagery was cut into strips approximately 15 frames long, the
SAM site occupying at least two of the 15 frames. Sixty-three US Marine
Corps image interpreters were given thirty minutes to analyze these film
strips. Each combination of camouflage level and film type was viewed by
7 randomly selected image interpreters. Each interpreter was used only
once. The visual cues that enabled the image interpreters to make a
detection and/or an identification were recorded on the data sheet at the
end of each test session.
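
The allocation described above (63 interpreters, three camouflage levels, three film types, 7 interpreters per combination, each interpreter used once) can be sketched as a simple randomization. This is only an illustrative sketch: the function name, the integer interpreter IDs, and the use of a seeded shuffle are assumptions, not the procedure actually used in the study.

```python
import random
from itertools import product

# Labels taken from the design description; the IDs 1..63 are hypothetical.
LEVELS = ["uncamouflaged", "tone-down plus decoy", "full camouflage"]
FILMS = ["B&W Plus X", "normal color", "color infrared"]


def assign_interpreters(n_interpreters=63, per_cell=7, seed=None):
    """Randomly assign each interpreter to exactly one (level, film) cell."""
    cells = list(product(LEVELS, FILMS))  # 3 x 3 = 9 cells
    if n_interpreters != per_cell * len(cells):
        raise ValueError("interpreters must fill every cell exactly")
    ids = list(range(1, n_interpreters + 1))
    random.Random(seed).shuffle(ids)  # random order, reproducible with a seed
    return {
        cell: sorted(ids[k * per_cell:(k + 1) * per_cell])
        for k, cell in enumerate(cells)
    }


if __name__ == "__main__":
    for cell, members in assign_interpreters(seed=1975).items():
        print(cell, members)
```

Under this layout, averaging across film types gives 21 interpreters per camouflage level, and averaging across camouflage levels gives 21 per film type.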

III. EXPERIMENTAL RESULTS.

All 63 of the image interpreters detected the SAM site within the allotted
30 minutes. Fifty-nine identified the site. The interpreters cited 13 visual
cues which contributed to the site's detection and 12 other visual cues
aiding site identification. The visual cues for both detection and identification
are extrapolations of specific military aspects of typical cues of
psychophysical stimuli materials such as size, shape, contrast, texture, and
color. The cues cannot be identified in this report due to security classification,
but are included in a confidential report by the author 1/. Tables 1
through 7 contain these detection cues averaged across different combinations
of camouflage level and film type. In addition, each table shows the frequency
with which the cue was reported by the image interpreters and which cues are
significantly different from each other at the .05 and .01 levels. This test of
significance was calculated using the technique of "minimum contrasts" 2/.
"Minimum contrasts" is a method to compare two proportions to determine whether
the observed contrast is significant at the chosen level. The proportions for this
study are the frequencies with which each visual cue was cited by the
interpreters as aiding them in site detection or identification.
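
To illustrate the general idea of comparing two cue proportions, the sketch below uses an ordinary pooled two-proportion z-test. This is a stand-in chosen for illustration and is not the "minimum contrasts" procedure itself, whose details are not given here. It also treats the two proportions as independent, which is only an approximation because the same interpreters could cite several cues. The function name and the example counts (41 and 22 citations out of 63, the two largest detection frequencies reported) are used purely as an example.

```python
import math


def two_proportion_z(count_a, count_b, n):
    """Pooled two-proportion z-test for two cues cited by n interpreters.

    Returns (z, two-sided p-value). Assumes independent proportions,
    which is an approximation for cues cited by the same people.
    """
    p_a = count_a / n
    p_b = count_b / n
    pooled = (count_a + count_b) / (2 * n)
    se = math.sqrt(pooled * (1 - pooled) * 2 / n)
    if se == 0:
        return 0.0, 1.0  # both cues cited by nobody or by everybody
    z = (p_a - p_b) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return z, p_value


if __name__ == "__main__":
    z, p = two_proportion_z(41, 22, 63)
    print(f"z = {z:.3f}, two-sided p = {p:.4f}")
    print("significant at .05:", p < 0.05, "| at .01:", p < 0.01)
```

A paired method such as McNemar's test would respect the dependence between cues, but it needs the per-interpreter responses, which are not available in this report.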

TABLE 1

Significant Differences in Detection Between Visual Cues Averaged Across
All Levels of Camouflage and Film Types.

Cue    Frequency
A      41
B      22
C      21
D      19
E      11
F      10
G      8
H      8
I      8
J      6
K      5
L      3
M      1

Cell Size = 63

*  = Significant Difference at α = .05
** = Significant Difference at α = .01

TABLE 2

Significant Differences in Detection Between Visual Cues Averaged Across
Film Types, Uncamouflaged Level.

Cue    Significance    Frequency
A                      13
B                      10
C                      7
D      **              4
E      ** -            2
F      ** -            3
G      ** ** -         1
H      ** -            3
I      **              4
J      **              4
K      **              3
L      ** -            3
M      ** ** -         1

Cell Size = 21

-  = Border Line Significance at α = .05
*  = Significant Difference at α = .05
** = Significant Difference at α = .01

TABLE 3

Significant Differences in Detection Between Visual Cues Averaged Across
Film Types, Tone-Down Plus Decoy Level.

Cell Size = 21

-  = Border Line Significance at α = .05
*  = Significant Difference at α = .05
** = Significant Difference at α = .01

TABLE 4

Significant Differences in Detection Between Visual Cues Averaged Across
Film Types, Full Camouflage Level.

Cue    Significance    Frequency
A                      14
B      -               6
C                      8
D      *               5
E      *               6
F      **              2
G      **              4
H      **              3
I      **              3
J      **              2
K      **              2
L      ** * * - *      0
M      ** * * - *      0

Cell Size = 21

-  = Border Line Significance at α = .05
*  = Significant Difference at α = .05
** = Significant Difference at α = .01

TABLE 5

Significant Differences in Detection Between Visual Cues Averaged Across
Camouflage Levels, Film Type - B&W Plus X.

Cell Size = 21

-  = Border Line Significance at α = .05
*  = Significant Difference at α = .05
** = Significant Difference at α = .01

TABLE 6

Significant Differences in Detection Between Visual Cues Averaged Across
Camouflage Levels, Film Type - Color.

Cell Size = 21

-  = Border Line Significance at α = .05
*  = Significant Difference at α = .05
** = Significant Difference at α = .01

TABLE 7

Significant Differences in Detection Between Visual Cues Averaged Across
Camouflage Levels, Film Type - Color IR.

Cue    Significance    Frequency
A                      12
B                      6
C                      8
D                      8
E                      5
F                      5
G      ** *            1
H      **              3
I      **              2
J      **              3
K      **              2
L      **              2
M      ** * **         0

Cell Size = 21

-  = Border Line Significance at α = .05
*  = Significant Difference at α = .05
** = Significant Difference at α = .01

Tables  8  through  14  contain  the  12  visual  cues  which  contributed  to  site 
identification  averaged  across  different  combinations  of  camouflage  and 
film type. Cue frequency and significance at α = .05 are again included
as in the preceding tables. As before, the cues cannot be identified
because  of  security  classification. 

TABLE 8

Significant Differences in Identification Between Visual Cues Averaged
Across All Levels of Camouflage and Film Types.

Cue    Significance            Frequency
B      **
C      **                      15
D      ** *                    13
E      ** **                   8
F      ** **                   8
G      ** **                   8
H      ** ** * *               4
I      ** ** ** **             2
J      ** ** ** **             2
K      ** ** ** ** * * *       1
L                              1

Cell Size = 59

*  = Significant Difference at α = .05
** = Significant Difference at α = .01

TABLE 9

Significant Differences in Identification Between Visual Cues Averaged
Across Film Types, Uncamouflaged Level.

Cue    Significance
A
B      **
C      **
D      **
E      **
F
G      **
H      **

Cell Size = 17

*  = Significant Difference at α = .05
** = Significant Difference at α = .01

TABLE 10

Significant Differences in Identification Between Visual Cues Averaged
Across Film Types, Tone-Down Plus Decoy Level.

Cue    Significance
A
B
C
D      *
E      **
F      ** *
G      **
H      * * *
I      ** * **
J      ** *
K      ** *
L      ** * **

Cell Size = 17

*  = Significant Difference at α = .05
** = Significant Difference at α = .01

TABLE 11

Significant Differences in Identification Between Visual Cues Averaged
Across Film Types, Full Camouflage Level.

Cell Size = 17

*  = Significant Difference at α = .05
** = Significant Difference at α = .01

TABLE 12

Significant Differences in Identification Between Visual Cues Averaged
Across Camouflage Levels, Film Type - B&W Plus X.

Cell Size = 17

*  = Significant Difference at α = .05
** = Significant Difference at α = .01

TABLE 13

Significant Differences in Identification Between Visual Cues Averaged
Across Camouflage Levels, Film Type - Color.

*  = Significant Difference at α = .05
** = Significant Difference at α = .01

TABLE 14

Significant Differences in Identification Between Visual Cues Averaged
Across Camouflage Levels, Film Type - Color IR.

Cell Size = 17

*  = Significant Difference at α = .05
** = Significant Difference at α = .01

IV. DISCUSSION.

A review of Tables 1-7 demonstrates that the isolation of the critical
visual cues for site detection was accomplished by the use of "minimum
contrasts." Detection cue A was a significant factor in all tables for the
detection of the SAM site. There was virtually no change in the importance
of this cue in site detection when analyzed across levels of camouflage
and film type. Therefore, more effort must be expended to prevent this cue
from becoming a major factor in target detection. The addition of the decoy
site adjacent to the SAM site had a pronounced effect in increasing the
number of significant cues that allowed the image interpreter to detect the
site (Table 3). Visual cues E and F, and H through M, did not have much
effect on site detection either for level of camouflage or type of film analyzed.
The number of cues leading to site detection was greater for the color and
color infrared film than for the black and white film (Tables 5-7). As is well
known, more information is presented to the image interpreter on color and
color infrared film than on black and white imagery.

Tables 8-14 indicate that the use of "minimum contrasts" to isolate
the critical visual cues in the identification of the SAM site was successful.
Visual cues important for site identification were different from those for site
detection. Identification cues A and B were the most important except for
camouflage level two, containing tone-down and site decoy. For this
case, cues A and C were the most prominent in site identification (Table 10).
The effects of visual cue C were essentially nil for levels one and three
(Tables 9 and 11). Visual cues D through L had little effect on site identification
when analyzed by level of camouflage or type of film. Color infrared
film generated more visual cues to target identification (Table 14) than both
color and black and white films (Tables 12-13). We consider this to be
due to the greater amount of information presented to the image interpreter
with color infrared film than with the other two film types. The results
indicated that this approach was a valid method to objectively analyze
subjective cues.

V. SUMMARY AND CONCLUSIONS.

The problem faced in this study was to objectively analyze the effects
of levels of camouflage on detection and identification. A SAM site was
selected and photographed. Subjective visual cues were elicited from
operational image interpreters in response to specially prepared classified
packets of site photography. These cues for both detection and identification
were grouped into categories and analyzed for significance using the
technique of "minimum contrasts". This technique facilitated the
quantification of the subjective cues used by image interpreters in site
detection/identification for levels of camouflage and types of photographic

A SIMPLE METHOD FOR DETERMINING THE
UNRESTRICTED  AVERAGE  OUTGOING  QUALITY 
LIMIT  (UAOQL)  OF  A  CONTINUOUS  SAMPLING  PLAN 

Richard  M.  Brugger 
RAM  Assessment  Division 
Product  Assurance  Directorate 
US  Army  Armament  Command 
Rock  Island,  Illinois 


ABSTRACT. This paper provides a simple algorithm for
determining the Unrestricted Average Outgoing Quality
Limit (UAOQL) of a continuous sampling plan. The derivation
of the algorithm is shown.

1.1 INTRODUCTION. As a prerequisite to a discussion
of the UAOQL, some review of the fundamentals of continuous
sampling is in order.

Most  statisticians  are  familiar  with  the  concept  of 
sampling  from  a  lot.  For  example,  we  might  have  a  lot  of 
one  hundred  items,  from  which  a  sample  of  size  seven  has 
been  drawn.  The  acceptance  decision  for  the  lot  will  be 
based  on  the  results  found  in  the  sample.  For  example, 
the  rules  of  the  sampling  plan  we  are  using  might  say  that 
if  two  or  fewer  units  out  of  the  sample  of  size  seven  are 
defective,  we  shall  accept  the  lot.  If  three  or  more  units 
are  defective,  we  shall  reject  the  lot. 

Under continuous sampling, we do not use the concept
of sampling a certain number of units from a lot of material.
Instead, we carry out inspection as the units are produced
and flowing along the production line.

The prerequisites for using a continuous sampling plan are:

a. Moving product.

b. Ample physical facilities for 100% inspection when
necessary.

c. Relative ease of inspection.

d. A process capable of producing homogeneous material.

An example of a continuous sampling plan is Harold
Dodge's CSP-1 [2]. Dodge was the original developer of
continuous sampling plans, and published his first work
on the subject in 1943. Under CSP-1, at the start of
inspection, the screening crew inspects 100% of the units.
When some prespecified number, i, of consecutive units are
free of the defects concerned, that is, the defects for
which we are inspecting, the screening crew is released
from 100% inspection and the sampling inspector inspects
a prespecified fraction, f, of the units, where the sample
units are selected in a random manner as they pass the
point of inspection. If a defective unit is found, 100%
inspection is resumed, and the cycle repeats itself as
necessary during the remainder of production.
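
The CSP-1 rules just described can be sketched as a small simulation. This is an illustrative sketch only, under assumptions not stated in the text: each unit is independently defective with probability p, each unit in the sampling phase is inspected independently with probability f, and any defective found is replaced by a good unit, so only uninspected defectives reach the outgoing product. The function and variable names are hypothetical.

```python
import random


def simulate_csp1(i, f, p, n_units, seed=None):
    """Simulate CSP-1 over n_units produced units.

    Returns (fraction inspected, fraction of units passed as defective).
    """
    rng = random.Random(seed)
    screening = True      # start under 100% inspection
    clean_run = 0         # consecutive defect-free units while screening
    inspected = 0
    defectives_passed = 0
    for _ in range(n_units):
        defective = rng.random() < p
        if screening:
            inspected += 1
            if defective:
                clean_run = 0            # found and replaced; restart the count
            else:
                clean_run += 1
                if clean_run >= i:       # clearance number reached
                    screening = False
        elif rng.random() < f:           # unit selected by the sampling inspector
            inspected += 1
            if defective:                # defect found: resume 100% inspection
                screening = True
                clean_run = 0
        elif defective:
            defectives_passed += 1       # uninspected defective goes out
    return inspected / n_units, defectives_passed / n_units


if __name__ == "__main__":
    afi, aoq = simulate_csp1(i=20, f=0.1, p=0.05, n_units=200_000, seed=1)
    print(f"fraction inspected ~ {afi:.3f}, outgoing fraction defective ~ {aoq:.4f}")
```

With p = 0 the plan settles into sampling and the fraction inspected approaches f; with p = 1 it never leaves 100% inspection.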

We  made  mention  of  the  values  of  i  and  f,  which  are 
specified for each individual CSP-1 plan. For example,
we  might  have  a  clearance  number,  i,  of  twenty  units,  and 
a  sampling  frequency,  f,  of  one  in  ten. 

Some of the functional properties of a CSP-1 plan
(or any CSP plan for that matter) that are usually of
interest to the statistician are the following:

a. The Average Fraction Inspected, or AFI, which is
the expected value of the fraction of material that will
be inspected over an indefinitely long period of time when
each unit has probability p of being defective.

b. The Average Outgoing Quality, or AOQ, which is
the expected fraction of material that is defective in
accepted material over an indefinitely long period of time
when each unit has probability p of being defective.

c. The Average Outgoing Quality Limit, or AOQL,
which is the maximum value of AOQ over all values of p.
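
For CSP-1 these three quantities have well-known closed forms, which the text above does not state. The sketch below uses the standard expressions AFI = f / (f + (1 - f) q^i) and AOQ = p (1 - AFI), with q = 1 - p, for the case where defectives found are replaced. The AOQL is located by a simple grid search over p, which is an assumption made here for brevity and not a method taken from the paper.

```python
def afi(i, f, p):
    """Average Fraction Inspected for CSP-1 (standard closed form)."""
    q = 1.0 - p
    return f / (f + (1.0 - f) * q**i)

def aoq(i, f, p):
    """Average Outgoing Quality when defectives found are replaced."""
    return p * (1.0 - afi(i, f, p))

def aoql(i, f, grid=100_000):
    """Largest AOQ over an evenly spaced grid of p values in [0, 1]."""
    return max(aoq(i, f, k / grid) for k in range(grid + 1))
```

For the example plan with i = 20 and f = 0.1, aoql(20, 0.1) returns the largest outgoing fraction defective that a constant-p process can produce under this plan.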

Thus far, we are talking about properties based on
the assumption that each unit has probability p of being
defective. This is of course a very restrictive assumption,
since one might intuitively feel that in the real
life situation, p would undergo some sort of variation
over time. For this reason, statisticians concerned
themselves with the problem of how to describe the mathematical
properties of continuous sampling plans when p varied over
time. In 1953, Lieberman [4] presented an analysis of
CSP-1 under the assumption that p was not constant for each
unit. He determined that the worst situation would be the
one where only good units reached the inspector during
phases of 100% inspection, and only bad units reached the
inspector during phases of sampling inspection.

The outgoing quality reflected by this worst possible
situation eventually came to be called the Unrestricted
Average Outgoing Quality Limit, or UAOQL. There is a very
interesting paper on the UAOQL by Sackrowitz [5] in the
April 1975 Journal of Quality Technology; however,
Sackrowitz's definitions are somewhat different from what
we will discuss here.

There  are  two  general  cases  that  we  will  consider: 
that  situation  where  defective  units  found  are  removed  from 
the  flow  of  product  and  replaced  with  good  units,  and  the 
situation  where  defective  units  found  are  removed  from  the 
flow  of  product  but  are  not  replaced  with  good  units. 

For the replacement case, White [6] carried out a
quite complex derivation involving linear programming to
show that for a broad class of plans, the UAOQL would result
from that situation where, for any phase of inspection
of a plan, either all good units are submitted during every
occurrence of the phase or all bad units are submitted
during every occurrence of the phase. White [7] computed
numerical solutions for plans from DOD Handbook H106.

Endres [3], an employee of mine at the time, showed that
this rule would apply also in the case where defective units
were removed from the flow of product, but were not replaced
with good units.

2. DISCUSSION. With the difficult mathematical proofs
thus out of the way, the possibility of developing a simple
algorithm presented itself [1]. The phases of inspection
could be treated as states in a Markov chain. Remember that
the UAOQL will result from a situation where, for every
occurrence of each phase, either only nondefectives are
submitted for inspection, or only defectives are submitted
for inspection.

Let us define configurations to be the values of

\[ y = (\phi(1), \phi(2), \ldots, \phi(k)) \]

where

k = number of states,

$\phi(j) = 0$ if during occurrences of the phase represented
by state j only nondefectives are submitted for
inspection,

$\phi(j) = 1$ if during occurrences of the phase represented
by state j only defectives are submitted for
inspection.

It is clear that for any plan of the type we are considering,
then, there will be $2^k$ configurations. For even moderate
sized values of k, the problem could be difficult if we had
to consider every configuration. Fortunately, we can make
the problem smaller.

Let  us  first  go  through  the  case  where  defective  units  are 
removed  and  replaced  with  good  units. 

Theorem 1: If a configuration exists such that for any
state j

(i) $\phi(j) = 0$, and

(ii) State j is an absorbing barrier,

then this configuration need not be considered in determining
the UAOQL.

Proof: Under the conditions stated in the theorem, the
long run outgoing quality would be zero.

Theorem 2: If a configuration exists such that for any
state j

(i) $\phi(j) = 1$, and

(ii) State j is an absorbing barrier,

then this configuration need not be considered in determining
the UAOQL.

Proof: Under the conditions stated in the theorem, the
long run outgoing quality would be zero.

We thus see that all configurations involving absorbing
barriers can be disregarded.

Consider CSP-1. Let state 1 be the 100% inspection state
and state 2 be the sampling state. We have the following
configurations:

$y_1 = (0, 0)$

$y_2 = (0, 1)$

$y_3 = (1, 0)$

$y_4 = (1, 1)$

Configurations with $\phi(1) = 1$ or $\phi(2) = 0$ can be disregarded,
since these would represent absorbing barrier situations.
Therefore $y_1$, $y_3$, and $y_4$ can be disregarded. The remaining
configuration, $y_2$, represents the situation under which the
UAOQL occurs: no defective units are submitted during periods
of 100% inspection, and only defective units are submitted during
periods of sampling inspection.
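
The elimination step for CSP-1 can be mimicked with a short enumeration. The sketch below is an illustration under a simplifying assumption that is not stated in the paper: a 100% inspection phase is left only after a run of nondefectives, and a sampling phase is left only when a sampled unit is defective. The dictionary CSP1_STATES and both function names are hypothetical.

```python
from itertools import product

# Hypothetical plan description: state number -> kind of phase.
CSP1_STATES = {1: "screening", 2: "sampling"}

def is_absorbing(kind, phi):
    """True if a phase of this kind can never be left under submission rule phi."""
    if kind == "screening":
        return phi == 1     # only defectives: the clearance run is never completed
    return phi == 0         # sampling with only nondefectives: no defect is ever found

def surviving_configurations(states):
    """Enumerate all 2**k configurations and drop those with an absorbing state."""
    keep = []
    for y in product((0, 1), repeat=len(states)):
        # states are assumed to be numbered 1..k, so phi(j) is y[j - 1]
        if not any(is_absorbing(states[j], y[j - 1]) for j in states):
            keep.append(y)
    return keep
```

For CSP-1, surviving_configurations(CSP1_STATES) returns [(0, 1)], in agreement with the discussion above. Plans whose sampling phases can also end without a defect being found would need a different is_absorbing rule.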

Let  us  now  define  another  term. 

A sequence of states which repeats itself indefinitely under
the conditions imposed shall be called a cycle. For example,
if a Markov chain consists of four states, and if a configuration
results in a sequence of states (1, 2, 3, 4, 3, 4, 3,
4, ...), then (3, 4) is a cycle.

Theorem 3: The long run outgoing quality for a configuration
involving cycles is equal to the average number of
defectives passed in a cycle divided by the average number
of units passed in a cycle.

Proof: The long run outgoing quality is

\[
\lim_{m \to \infty} \frac{\sum_{i=1}^{m} (\text{defectives passed in cycle } i)}{\sum_{i=1}^{m} (\text{units passed in cycle } i)}
= \lim_{m \to \infty} \frac{m\,(\text{average number of defectives passed in a cycle})}{m\,(\text{average number of units passed in a cycle})}
= \frac{\text{average number of defectives passed in a cycle}}{\text{average number of units passed in a cycle}}.
\]

Considering CSP-1 again, it has been shown that only configuration
$y_2 = (0, 1)$ need be considered. Since the sequence of
states (1, 2, 1, 2, ...) occurs, we may refer to (1, 2) as
a cycle. Using Theorem 3, we may then say that

\[
\mathrm{UAOQL} = \frac{(\text{average number of defectives passed during 100\%}) + (\text{average number of defectives passed during sampling})}{(\text{average number of units passed during 100\%}) + (\text{average number of units passed during sampling})}
= \frac{0 + \left(\frac{1}{f} - 1\right)}{i + \frac{1}{f}},
\]

where i is the length of 100% inspection and f is the
sampling frequency.

Let us now consider the case where
defective units found are removed but not replaced with
good units.


Theorem 1': If a configuration exists such that for any
state j

(i) $\phi(j) = 0$, and

(ii) State j is an absorbing barrier,

then this configuration need not be considered in determining
the UAOQL.

Proof: Under the conditions stated in the theorem, the
long run outgoing quality would be zero.

We see that this is the same as Theorem 1 for the replacement
case.

Theorem 2': If a configuration exists such that for state
1 (corresponding to the first phase encountered
in using the sampling plan)

(i) $\phi(1) = 1$, and

(ii) State 1 is an absorbing barrier,

then this configuration need not be considered in determining
the UAOQL.

Proof: Under the conditions stated in the theorem, no
product would be passed at all; hence, outgoing quality would
not be definable.

Theorem 3': If the number of units passed in a cycle is
greater than zero, then the long run outgoing
quality for a configuration is equal to the
average number of defectives passed in a cycle
divided by the average number of units passed
in a cycle.

Proof: Same as Theorem 3 for the replacement case.

Theorem 4': If a cycle passes zero units, it is only necessary,
in determining the long run outgoing
quality, to consider those states, if any,
which occur before cycling begins.

Proof: The fraction defective of material passed by the
inspection system would remain unchanged once cycling begins,
since no more units would be passed. This theorem is useful
when a 100% inspection state other than state 1 is an
absorbing barrier.

As an example, let us consider the simple case of CSP-1
again under the nonreplacement assumption. We have the
configurations

$y_1 = (0, 0)$

$y_2 = (0, 1)$

$y_3 = (1, 0)$

$y_4 = (1, 1)$

Configurations with $\phi(1) = 1$ or $\phi(2) = 0$ can again be disregarded,
since these would represent absorbing barrier
situations with no defective units passing. Again $y_2 =
(0, 1)$ is the only configuration that need be considered.

In the nonreplacement case, then,

\[
\mathrm{UAOQL} = \frac{(\text{average number of defectives passed during 100\%}) + (\text{average number of defectives passed during sampling})}{(\text{average number of units passed during 100\%}) + (\text{average number of units passed during sampling})}
= \frac{0 + \left(\frac{1}{f} - 1\right)}{i + \left(\frac{1}{f} - 1\right)}
= \frac{1 - f}{f(i - 1) + 1}.
\]
In our examples, we have used the simplest case, CSP-1.
However, in practice, we have found that we can use this
method for plans of some complexity in order to determine
the UAOQL for either the replacement or the nonreplacement
case.
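
As a numerical companion to the two CSP-1 results above, the cycle bookkeeping of Theorems 3 and 3' can be written out directly. The sketch below is illustrative and assumes configuration (0, 1): i good units pass during each 100% inspection phase, and each sampling phase passes 1/f - 1 unsampled defectives before the sampled defective is found. The function name and the replace flag are hypothetical. Exact fractions are used, so f should be given as a Fraction or as a string such as "1/10".

```python
from fractions import Fraction

def uaoql_csp1(i, f, replace=True):
    """Worst-case long run outgoing quality of CSP-1 under configuration (0, 1)."""
    f = Fraction(f)
    defectives_passed = 1 / f - 1          # unsampled defectives per sampling phase
    # The sampled defective is either replaced by a good unit (one more unit
    # passed) or removed without replacement (no unit passed).
    sampled_unit = 1 if replace else 0
    units_passed = i + defectives_passed + sampled_unit
    return defectives_passed / units_passed

print(uaoql_csp1(20, "1/10"))                  # 3/10
print(uaoql_csp1(20, "1/10", replace=False))   # 9/29
```

For the plan with i = 20 and f = 1/10 mentioned earlier, the worst case is therefore 30% defective with replacement and about 31% without.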
ABSTRACT. Given a Markov Chain (MC) model for a particular Continuous
Sampling Plan (CSP), a method of restructuring its states into a simpler
Semi Markov Chain (SMC) pattern is used to analyze MC functionals which
are partially defined by random backward shifts in operational time.

Specifically, the usual MC model, for the Job Shop case of CSP-1,
initially starts with an inspection phase of I states and thereafter cyclically
alternates between it and a sampling phase. However, whenever sampling
is terminated, this plan is modified by the additional requirement of a
(limited) Downstream Inspection (DSI) of the previous I units followed by a
phase transition determined by the outcome of such an inspection. For a production
run of length N, this modification induces a corresponding one in the
expected value of the associated basic functional: Fraction Inspected [FI(N;1)].
Both modifications are handled here by 1.) slightly changing the usual SMC
reduction and 2.) coupling this change with a new functional: Incremented
Fraction Inspected [IFI(N;2)]. The expected value of the functional Total
Fraction Inspected [TFI(N;2)] is then expressed as the expected value of the
sum of two terms: the new functional and the (unmodified) functional, FI(N;2),
defined on the altered SMC. In addition to comparing the long run expressions
for TFI and AFI, a comparison is also made between TFI and the expression which
results from the more familiar requirement of (limited) Upstream Inspection
(USI).

In  analyzing  the  above  situation  for  finite  N,  two  interpretations  of 
DSI  are  subsequently  studied.  The  first,  based  on  possible  inspection  or 
manufacturing  irregularities  in  both  phases,  is  the  scheme  already  referred 
to  above.  The  second,  based  only  on  the  putative  assumption  of  sampling  phase 
irregularities,  is  a  less  strict  version.  For  N  infinite,  a  comparison  is  made 
between  the  expected  values  of  the  two  TFI’s. 

Since,  in  either  of  the  two  plans,  TFI  does  not  explicitly  take  account 
of  multiple  inspections  of  the  same  unit,  other  measures  of  plan  performance 
are considered which do. To this end, the paper concludes with a study of the
functional Fraction of Repetitions (FR), its first moment, and a variant functional.
In order to deal with this functional, further modifications in the SMC
model for CSP-1 are necessary.
1.0 BACKGROUND.


1.1 Introduction. The principal subject of this paper is the study of
variations in one member of a class of sampling plans and functionals defined
on these variations. The class referred to is that of certain
Continuous Sampling Plans (CSP) which are treated as finite state, irreducible,
time homogeneous, and aperiodic Markov Chains (MC). The element
referred to, classically denoted by CSP-1, is the simplest element of this
class. In dealing with these MC models, four different kinds of standard
groupings, called phases, of their states can be distinguished: screening
(sc), unlimited sampling (uls), limited sampling (ls), and checking (ck).
Using the terminology of phases, a given CSP can then be defined as a
collection of two or more different phases (normally, one of which is sc)
which are linked together in accordance with sampling frequency criteria.
Throughout the bulk of the paper, only the two canonical phases making up
CSP-1 will be considered; interest will be especially focused on structural
changes in uls which are brought about by Downstream Inspection (DSI). At
the end of Chapter 3, the checking phase will also be briefly considered
since it can be regarded as Upstream Inspection (USI).

CSP-1  and  the  major  variation  in  it,  brought  about  by  DSI,  are  portrayed 
in  Figure  1. 


Figure 1. CSP-1 and DSI


In  Figure  1,  CSP-1  consists  of  the  top  two  boxes  connected  together 
with  the  solid  lines.  The  DSI  plan,  denoted  by  CSP-12,  is  obtained  from 
CSP-1  by  replacing  the  top  solid  line  by  the  dotted  ones  and  adding  the 
lower  box.  Two  approaches  will  be  used  to  handle  this  change. 

The  first  approach,  given  in  Chapters  2  and  3,  consists  in  counting 
only  the  extra  units  inspected  without  regard  to  any  inspection  repetitions 
due  to  DSI.  In  the  second  approach,  given  in  Chapter  4,  all  units  inspected 
are also counted, but now including repetitions. Both approaches use, as
the main tool, Semi Markov Chain (SMC) reduction of MC models, which is now
briefly described.

In describing SMC reduction, the term macrostate will be used to refer
to an ensemble of MC states which is structured as a (discrete) SMC state
(e.g., a canonical phase of a CSP). To be a macrostate, an ensemble must
satisfy two conditions. 1.) The MC probability of entrance vector (pev)
into the ensemble, given that such an entrance occurs, must be stationary
and independent of the state from which the entrance is made. In other
words, letting the ensemble S be composed of the k MC states j, $1 \le j \le k$,
we impose the condition that, for an arbitrary time n,

\[ v(n) = v \]

where

\[ v(n) = (v_1(n), v_2(n), \ldots, v_k(n)), \]

\[ v_j(n) = P[M(n) = j \mid M(n) \text{ in } S,\ M(n-1) \text{ not in } S], \]

and $M(\cdot)$ is the MC process.

2.) Subject to the restrictions of 1.) for a given target macrostate, an
exit can occur from a MC state of the ensemble into a MC state of the
macrostate only if the first state communicates with the second in the
underlying MC. To avoid a circular construction, we finally note that
any MC state is, itself, a (trivial) macrostate.

Two different, but equivalent, methods can be used to construct such
macrostates from MC states: the MC method, which is pedestrian, but straightforward,
and the SMC method, which is more subtle but nearer to the general
idea of SMC reduction. Under either method, MC functionals induce well defined
SMC ones and the MC properties of time homogeneity, irreducibility, and
aperiodicity are preserved [cf., 6.2 and 6.8].
In the MC method, the component states of a given macrostate and
the possible exit macrostates are the transient states and the absorbing
states, respectively, of an absorbing MC which is derived from a partitioning
of the original MC. The possibly defective probability density
function (pdf) of a transition of the macrostate to any one of the target
macrostates is then just the weighted sum of the first entry probability
functions, each weighted by the component in the stationary pev. In the
more constructive SMC method, a given parent macrostate is considered to
be made up of two or more smaller macrostates (including a MC state with
or without self transitions). To such a division, the "MC method" is
applied, only now to an absorbing SMC. The derived system of Backward
Equations (see A.21), or, in simpler situations, direct combinatorial
analysis is then applied to obtain the resulting first entry SMC probability
functions. Their weighted sum, again weighted by components of the
(induced) stationary pev, yields the pdf of the parent macrostate (to some
one target macrostate). This latter method is easier to use and intuitively
more appealing; it will be used almost exclusively throughout this paper
except for a simple example of the MC method given at the end of Chapter 1.
Furthermore, the SMC method, at any stage in its use, emphasizes the concepts
1.) of constructing from a given MC a class of SMC's which is partially
ordered by filtration [6.2 and 6.7] and 2.) of using different elements of
this class to attack either different problems or different stages of one
problem which arise from the original MC.

Neither of these two methods should be confused with the process of
lumping as it is defined in [6.13]. In fact, for CSP's, it is not possible
to lump the states in each phase, in the above sense, into a new MC state.

A more thorough presentation of SMC reduction, with many applications, can
be found in [6.2]. The notation, definitions, and theorems concerning SMC's
that are needed in this paper are taken from this reference and can be found
in the Appendix. A more heuristic approach to SMC reduction (for the
stationary case) together with further applications can be found in [6.4,
6.5, and 6.6] where it is called The Simplified Markov Chain Method.

In summary, the MC method can be stated as follows. Given the components,
$v_j$, of the pev and the MC first entrance probability function
$f_{j,A}$ from j to a target macrostate (absorbing state) A, the equation for the pdf
from S to A is (see Appendix for notation)

\[ Q_{S,A} = \sum_{j=1}^{k} v_j f_{j,A}. \qquad \text{(A1)} \]

Similarly, the SMC method leads to the same form for the RHS of Eq. A1
in which the f's are replaced by SMC first entrance probability functions.

Another  ubiquitous  tool,  used  in  concert  with  SMC  reduction,  is  the 
z-transform.  The  transform  is  applied  here  to  probability  sequences 
rather  than  to  the  transitional  matrices  themselves.  This  approach  is 
taken  because,  in  practical  applications,  the  ranks  of  the  matrices  are 
quite  large  (about  3  x  10^  and  greater) .  Thus  the  ranks  of  the  complex 
functional  matrices,  obtained  via  the  transform,  would  be  so  large  that 
1.)  important  relationships  would  be  obscured  and  2.)  an  analysis  of  them 
would  be  almost  as  difficult  as  that  done  without  the  transform.  The 
salient features of the transform can be found in [6.3 and 6.12]. We record
here only some basic notation that will be used with sequences treated
as functions from the natural numbers to the reals. Given a sequence a(n),
a(z) is its z-transform. Given sequences a(n) and b(n), a*b(n) is their
convolution. The (Dirac) sequence which is one for the
argument equal to n and zero otherwise has z-transform $1/z^n$. $H_n(k)$ denotes the
(Heaviside) sequence which is one for the argument greater than or equal
to n and zero otherwise; $H_n(z) = (1/z^n)(z/(z-1))$.
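
The two transforms quoted above are consistent with the convention a(z) = sum over k of a(k)/z^k, which is assumed in the following numerical check; the paper's own definition is in its references. The sketch is illustrative: all function names are hypothetical, sequences are represented as Python callables, and the infinite sum is truncated, so it is only meaningful for |z| > 1.

```python
def z_transform(seq, z, terms=200):
    """Truncated z-transform: sum of seq(k) * z**(-k) for k = 0 .. terms - 1."""
    return sum(seq(k) * z ** (-k) for k in range(terms))

def dirac(n):
    """Sequence equal to one at argument n and zero elsewhere."""
    return lambda k: 1.0 if k == n else 0.0

def heaviside(n):
    """Sequence equal to one for arguments >= n and zero below n."""
    return lambda k: 1.0 if k >= n else 0.0

def convolve(a, b):
    """Convolution of two sequences defined on the natural numbers."""
    return lambda k: sum(a(j) * b(k - j) for j in range(k + 1))
```

Evaluating z_transform(heaviside(3), 2.0) and comparing it with (1/2.0**3) * (2.0/(2.0 - 1)) checks the Heaviside formula, and the transform of convolve(a, b) can be compared with the product of the two transforms.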

1.2 SMC(1) and FI(N;1). The basic premise used in modelling a CSP is
that the underlying production process is a Bernoulli process with a constant
probability of defective p (and probability of non-defective q = 1-p).
In particular, the MC structure of CSP-1, which is fully described in [6.1,
6.2, and 6.4], arises from the sequential sampling scheme imposed on the
above process with an operational time defined by the flow of non-repeating
production units. The SMC model of CSP-1, derived from the MC model, is
given in Figure 2 and is denoted by SMC(1).


Figure 2. SMC Model of CSP-1 (SMC(1))


For the model in Figure 2, I = clearance number for sc, f = sampling
frequency for uls, p = probability of defective, q = 1-p, and we have the
following statements expressed in

Theorem 1. Let sc = 1 and uls = 2. Then, SMC(1) is an irreducible
SMC.

Proof. The SMC states are
$(1; Q_{12}(z))$ and $(2; Q_{21}(z))$,
where

\[ Q_{12}(z) = \frac{q^I (z - q)}{\phi(z)} \quad\text{and}\quad Q_{21}(z) = \frac{\delta}{z - \beta}. \qquad (1.1) \]

In Eqs. 1.1, $\phi(z) = z^I(z-1) + \gamma$, $\gamma = pq^I$, $\delta = fp$, and $\beta = 1-\delta$.

The transitional matrix of the embedded MC is

         1    2
    1    0    1
    2    1    0

Even though it clearly has period 2, the SMC is none-the-less aperiodic
[6.2 and 6.8]. It easily follows from the matrix that e = (1/2, 1/2) is
the stationary (but not long run) vector.

Using the notation in the Appendix (see A.25),

\[ \mu_1 = \frac{1 - q^I}{p q^I} \quad\text{and}\quad \mu_2 = \frac{1}{fp}. \]

The last two statements and A.25 imply

\[ P_1(\infty;1) = \frac{\mu_1}{\mu_1 + \mu_2} \quad\text{and}\quad P_2(\infty;1) = \frac{\mu_2}{\mu_1 + \mu_2}. \]

Further details are found in [6.2], which finishes the proof.

The expressions $(1-q^I)/pq^I$ and $1/fp$, in Theorem 1, appear throughout
the paper and will hereafter be abbreviated by the symbols $\mu'_1$ and $\mu'_2$,
respectively. These special primed symbols are used to avoid confusion
with standard notation (see A.25) and, at the same time, to serve as a
reminder of their origin (i.e., CSP-1).

The principal measure of plan performance for CSP-1 is the Fraction
Inspected (FI) functional which is given in

Definition 1. For a production run of length N and sampling plan CSP-1,

\[ FI(N;1) = 1 - \frac{v}{N} \sum_{j=0}^{N} C_2(j) \]

where $C_2(\cdot)$ is the characteristic function for state 2 = uls and v = 1-f.

Taking the expected value of the above functional, conditioned by an
initial start in sc (Job Shop case), letting N approach infinity, and using
the Ergodic Theorem, we have [6.1 and 6.2]

\[ AFI(\infty;1) = 1 - vP_2(\infty;1) \qquad \text{(A2)} \]

where the LHS of Eq. A2 is defined by

\[ \lim_{N \to \infty} E_{sc}[FI(N;1)]. \]
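
Eq. A2 can be evaluated directly from the two mean holding times given in the text, $(1-q^I)/pq^I$ for screening and $1/fp$ for sampling. The sketch below is a numerical illustration, not part of the paper. It assumes 0 < p < 1 and that the long run probability of the sampling phase is the ratio of its mean holding time to the sum of the two; the function name is hypothetical.

```python
def long_run_afi(I, f, p):
    """AFI(inf;1) = 1 - v * P2, with P2 built from the two mean holding times."""
    q = 1.0 - p
    mu1 = (1.0 - q**I) / (p * q**I)   # mean stay in the screening phase
    mu2 = 1.0 / (f * p)               # mean stay in the sampling phase
    p2 = mu2 / (mu1 + mu2)            # long run fraction of time spent sampling
    v = 1.0 - f
    return 1.0 - v * p2
```

Algebraically this reduces to f / (f + (1 - f) q^I), the familiar closed form for the average fraction inspected under CSP-1, which gives a quick consistency check on the reduction.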


1.3 MC Method (An Example). The MC method will be briefly illustrated by
applying it to the MC model of uls. This model consists of two MC states:
SN, the non-inspection state and SI, the inspection state. The transitional
matrix of the absorbing MC, derived from the MC model of any CSP having a uls
phase, is

          SN    SI    A
    SN    v     f     0
    SI    qv    qf    p
    A     0     0     1

where A is the only possible target phase to be entered. The pev for the
ordered ensemble S = (SN, SI) is v = (v, f) which induces an initial probability
vector (v, f, 0) for the states (SN, SI, A), where A is the absorbing
state, the other two being transient. Thus, from Eq. A1, we need to derive
the expression

\[ (v) f_{SN,A} + (f) f_{SI,A}. \]

From the Chapman-Kolmogorov equation, a difference equation for the first
entry probability functions can be derived. z-Transforming this difference
equation, we obtain

\[ Q_{uls,A}(z) = \frac{\delta}{z - \beta} \]

where $\delta = fp$ and $\beta = 1-\delta$.

In a similar manner, $Q_{sc,A}(z)$ can be derived using an (I+1) x (I+1)
transitional matrix consisting of I transient and one absorbing states
[6.2]. Also, for this latter function, see [6.10, Chp. 13] for a different
derivation which is based on renewal theory and Bernoulli trials.
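
The uls example can be checked by stepping the absorbing chain forward and recording the probability of first entering A at each step. The sketch below is illustrative only. It reads the transitional matrix as row-stochastic over the states (SN, SI, A), with rows (v, f, 0), (qv, qf, p), and (0, 0, 1), which is an assumption about the layout; the function name is hypothetical.

```python
def first_entry_pdf(f, p, n_max):
    """P[first entry into A at step n] for n = 1 .. n_max, starting from (v, f, 0)."""
    v, q = 1.0 - f, 1.0 - p
    P = [[v,     f,     0.0],   # from SN: next unit skipped or inspected
         [q * v, q * f, p],     # from SI: defect found (absorb) or continue
         [0.0,   0.0,   1.0]]   # A is absorbing
    dist = [v, f, 0.0]          # pev (v, f) extended to the absorbing state
    pdf = []
    for _ in range(n_max):
        new = [sum(dist[r] * P[r][c] for r in range(3)) for c in range(3)]
        pdf.append(new[2] - dist[2])   # mass that entered A on this step
        dist = new
    return pdf
```

Under these assumptions the result is the geometric sequence with terms delta * beta**(n - 1), whose z-transform is delta / (z - beta) with delta = fp and beta = 1 - delta.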


1.4 Notation and Terminology. Three essentially different plans will be
studied in future chapters. They are denoted by CSP-12, CSP-13, and
CSP-14. For ease in indexing functionals, CSP-1 will henceforth be denoted
by CSP-11. SMC models associated with the above plans will be denoted by
SMC(k), k a positive integer; in one case, a Markov Renewal Process (MRP)
model is constructed for CSP-12 and is denoted by MRP(2). A MC state without
self transitions will be called a trivial SMC state; one with self
transitions will sometimes be considered as a (non-trivial) SMC state with
a geometrically distributed holding time. A[functional] will usually mean
$E_{sc}$[functional] for the models considered. In particular, with respect
to some other set of models, A[·] could have an entirely different definition.
Theorems, propositions, and definitions are numbered consecutively
throughout the paper. Statement y of section x in the Appendix will be
denoted by A.xy.
1.6 Principal Results. For the quantities referenced below, $\delta = fp$,
$\beta = 1-\delta$, and $v = 1-f$.

Eq. A2 gives $AFI(\infty;1)$; $P_2(\infty;1)$, $\mu'_1$, and $\mu'_2$ are given in Theorem 1.

Eq. B8 gives $ATFI(\infty;2)$; $P_2(\infty;2)$ is given in Theorem 4 and TFI is
defined in Definition 2.

Eq. C8 gives $ATFI(\infty;3)$; $P_2(\infty;3)$ is given by Eq. C7.

Theorems 17 and 20 give $AFR(\infty;2)$; Theorem 18 gives $AFR'(\infty;2)$.

Theorem 7 compares $ATFI(\infty;2)$ and $AFI(\infty;1)$; Theorem 14 compares
$ATFI(\infty;2)$ and $ATFI(\infty;3)$.


2.0  DSI  -  GENERAL.  Having  initially  started  in  the  screening  phase 
(Job  Shop  case),  if  a  defect  is  found  in  the  sampling  phase  at  time  n, 
n  >  I,  Downstream  Inspection  (DSI)  requires  1.)  a  return  to  unit  n-I  with 
100%  inspection  of  the  succeeding  I  units  and  2.)  entrance  to  the  sampling 
phase  (screening  phase)  if  no  (one  or  more)  defects  are  found  upon  com¬ 
pletion  of  1.).  DSI  is  portrayed  in  Figure  1,  Chapter  1. 

2.1  Introduction.  If  the  DSI  stage  is,  for  the  moment,  intuitively 
looked  on  as  a  "pseudophase",  the  Total  Fraction  Inspected  (TFI)  can  be 
obtained  by  treating  it  as  a  modification  of  FI(N;1).  Conceptually,  this 
modification  can  be  broken  down  into  two  separate  parts.  The  first  is  an 
additive  fractional  increase  due  to  a  sum  each  term  of  which,  after 
multiplication  by  N,  is  equal  to  v  min(k,I)  where  k+1  is  the  duration  of 
the corresponding sampling phase segment. The second is a nonlinear decrease
in FI(N;1) due to the transitional requirements that come into force
upon leaving the "pseudophase". The decrease occurs because, upon finding
a  defect,  there  is  a  chance  of  immediate  (at  least  in  the  sense  of  opera¬ 
tional  time)  return  to  the  sampling  phase  rather  than  an  automatic  entrance 
to  the  screening  phase  which  would  otherwise  take  place  in  CSP-11.  The 
finite  probability  of  this  immediate  return  results  in  a  fractional  increase 
in  units  not  inspected  and,  therefore,  a  corresponding  fractional  decrease 
in  units  inspected. 

These  remarks  lead  to  the  following  proposed  solution.  The  nonlinear 
decrease  can  be  dealt  with  by  weaving  the  transitional  requirements  of  the 
"pseudophase"  into  the  SMC  structure  of  CSP-11  thereby  yielding  a  new  SMC 
and  its  Fraction  Inspected  function,  FI(N;2).  The  additive  increase  can  then 
be  easily  handled  by  coupling  a  new  Incremented  Fraction  Inspected  functional, 
IFI(N;2),  to  FI(N;2).  Adding  these  two  functionals,  we  finally  have 
Definition 2. The Total Fraction Inspected is given by
TFI(N;2) = FI(N;2) + IFI(N;2).

In this chapter, $ATFI(\infty;2)$ is found and compared with $AFI(\infty;h)$,
h = 1, 4. In Chapter 4, other functionals and SMC models are studied
since the one considered here and its transient version, treated in
Chapter 3, are not complete measures of plan performance.

2.2 MRP(2) and $IFI(\infty;2)$. For $ATFI(\infty;2)$, the solution proposed in the
introduction suggests a model for CSP-12 given in Figure 3 and denoted by
MRP(2). This model is a Markov Renewal Process whose definition is given
in A.19 (also see A.28).

Figure 3. Model for CSP-12 (MRP(2))


Concerning the model in Figure 3, we have

Theorem 2. MRP(2) is a MRP. Letting sc = 1 and uls = 2, the states
are

$(1; Q_{12}(z))$ and $(2; Q_{21}(z), Q_{22}(z))$

where

\[ Q_{12}(z) = \frac{q^I (z - q)}{\phi(z)}, \qquad (2.1) \]

\[ Q_{21}(z) = \frac{\delta (1 - q^I)}{z - \beta}, \qquad (2.2) \]

and

\[ Q_{22}(z) = \frac{\delta q^I}{z - \beta}. \qquad (2.3) \]

Proof. Eq. 2.1 follows from the model for CSP-11. Eqs. 2.2 and
2.3 follow from the model for CSP-11 and the introductory remarks to
Chapter 2 since, upon completion of a DSI segment, the sampling phase
(screening phase) is entered with probability $q^I$ (probability $1-q^I$)
with operational time playing no role. MRP(2) is a MRP by definition.

The  definition  of  FI(N;2)  is  of  the  same  form  as  that  given  for 
FI(N;1)  in  Definition  1.  We  now  define  the  incremental  functional  in 

Definition 4. Let W(·) be the following functional:

$$W(t) = \sum_{s=1}^{Z} R_s$$

where $Z = N_2(t) - 1$ and $R_s = \min(k, I)$ if the $s$th exit from state 2 takes
place $(k+1)$ time units from the $s$th entrance. Then the Incremented Fraction
Inspected functional for MRP(2) is

$$\mathrm{IFI}(t;2) = v\,\frac{W(t)}{t}$$

where $v = 1 - f$.

Filtering out state 1 from MRP(2), we obtain the pdf of the renewal
time for an occurrence of state 2 which is given by

$$Q^{*}_{22}(t) = Q_{21} * Q_{12}(t) + Q_{22}(t).$$

Thus, averaging the time over one renewal cycle, we have

$$E[T] = \sum_{k=1}^{\infty} k\,P[T_{21} + T_{12} = k \ \text{or}\ T_{22} = k]
      = \sum_{k} k\,Q^{*}_{22}(k)
      = \sum_{k} k\,Q_{21} * Q_{12}(k) + \sum_{k} k\,Q_{22}(k).$$

From the mean value property of the z-transform, we must evaluate

$-zD_z(Q_{21}Q_{12})$ and $-zD_z(Q_{22})$

at z = 1. Calling the results of the evaluation $m_1$ and $m_2$, respectively,
we have

$$m_1 = \frac{\bar\delta}{\delta}\,(\mu'_1 + \mu'_2) \quad\text{and}\quad m_2 = \frac{q^I}{\delta}$$

where

$\bar\delta = \delta(1-q^I)$.

Proposition 1.

$$E[T] = (1-q^I)\mu'_1 + \mu'_2 \qquad (B1)$$

Proof. See above.

Averaging W(·) over one cycle yields

$$E[W] = \sum_{k=1}^{I} k\,P[W=k]
       = \sum_{k=1}^{I} k\,\delta\beta^{k} + I\,\delta\beta^{I+1}\sum_{j=0}^{\infty}\beta^{j}.$$




Since 


substituting  the  RHS  of  this  equation  for  the  RHS  above  and  simplifying, 
we  have 


Proposition 2.

$$E[W] = \frac{\beta}{\delta}\,(1-\beta^{I}) \qquad (B2)$$

Proof. From above.
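
As a numerical cross-check of Eq. B2, the following minimal sketch compares a truncated direct sum with the closed form. It assumes, as the model for the limited inspection state suggests, that an exit from state 2 occurs k+1 time units after entrance with probability $\delta\beta^k$ (with $\delta = 1-\beta$), so that the per-cycle increment is min(k, I). The function names and the truncation length are illustrative choices, not part of the original paper.

```python
def expected_increment_direct(beta: float, I: int, terms: int = 5000) -> float:
    """Truncated direct sum of E[min(K, I)], where P[K = k] = (1 - beta) * beta**k."""
    delta = 1.0 - beta
    return sum(min(k, I) * delta * beta**k for k in range(terms))


def expected_increment_closed(beta: float, I: int) -> float:
    """Closed form (beta / delta) * (1 - beta**I), as in Eq. B2."""
    delta = 1.0 - beta
    return beta * (1.0 - beta**I) / delta


if __name__ == "__main__":
    # Illustrative parameter choices only.
    for beta, I in [(0.5, 3), (0.9, 10), (0.2, 1)]:
        print(beta, I,
              expected_increment_direct(beta, I),
              expected_increment_closed(beta, I))
```

The two columns printed should agree to within the truncation error of the direct sum, which is negligible for the values of $\beta$ used here.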

We are now ready to prove
Theorem 3. For MRP(2),

$$\mathrm{IFI}(\infty;2) = v\beta(1-\beta^{I})\,\frac{\mu'_2}{(1-q^I)\mu'_1 + \mu'_2} \quad [\text{a.e.}].$$

Proof. By the Strong Renewal Theorem [6.7, 6.9], we have

$$\lim_{N\to\infty} \frac{W(N)}{N} = \frac{E[W]}{E[T]} \quad [\text{a.e.}].$$

The theorem follows from this result, Props. 1 and 2, Def. 3, and
simplification.


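Corollary.

$$\mathrm{AIFI}(\infty;2) = v\beta(1-\beta^{I})\,\frac{\mu'_2}{(1-q^I)\mu'_1 + \mu'_2}$$

Proof.

$$\mathrm{AIFI}(\infty;2) = v \lim_{N\to\infty} \frac{E_{sc}[W(N)]}{N}$$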



2.3 SMC(2) and IFI(∞;2). By its very definition the functional W(·)
depends on the sample paths of a MRP, including the self-transitions of
a component state. In particular, the fundamental probability functions
(see A.12) of the induced SMC (see A.28) are not sufficient to describe
W since they don't record the self-transitions of uls. However, just as
MRP(2) induces a unique SMC(2), W also induces a correspondingly unique
functional, W*(·), defined on the chain. We first prove

Theorem 4. MRP(2) induces a unique SMC, denoted by SMC(2).

Proof. From A.28, SMC(2) can be defined via its pdf's as follows
[cf. 6.4]:

$$Q^{*}_{12}(z) = Q_{12}(z) \qquad (4.1)$$

and

$$Q^{*}_{21}(z) = Q_{21}\sum_{j=0}^{\infty} Q_{22}^{\,j}
  = \frac{Q_{21}}{1-Q_{22}}
  = \frac{\delta(1-q^I)}{z-(\beta+\delta q^I)}
  = \frac{\bar\delta}{z-\bar\beta}. \qquad (4.2)$$

Recalling the definitions of $\mu_1$ and $\mu_2$ from Theorem 1, we have, from
the derivatives of Eqs. 4.1 and 4.2,

$$\mu_1 = \mu'_1 \quad\text{and}\quad \mu_2 = \frac{1}{\bar\delta}$$

where

$\bar\delta = \delta(1-q^I)$ and $\bar\beta = 1-\bar\delta$.

From A.25, we thus obtain

$$P_1(\infty;2) = \frac{(1-q^I)\mu'_1}{(1-q^I)\mu'_1 + \mu'_2}$$

and

$$P_2(\infty;2) = \frac{\mu'_2}{(1-q^I)\mu'_1 + \mu'_2} \qquad (B3)$$

where 1 and 2 on the LHS's are sc and uls*, respectively.

The transitional matrix and stationary vector of SMC(2) are the
same as in Theorem 1 which finishes the proof.

Our goals now are to find E[W*] and E[T*]. To this end, we prove

Theorem 5. The functional W(·) induces a well defined functional,
W*(·), on SMC(2).

Proof. W* is implicitly defined through the following equations.
Conditioning on the number of self-transitions, j, of uls, we have

$$P[W^{*}=k] = \sum_{j=0}^{\infty} P[W^{*}=k \mid j]\,P[j] = \sum_{j} a_j(k)$$

where

$a_j(k) = P[W^{*}=k \text{ and } j \text{ repetitions}]$.

Noting that $a_j(k)$ can be defined in terms of $a_s(k)$, s < j, we can derive
a set of equations relating the above a's. For ease in notation, we first
define




Then,  for  0  rS  k  S  (j+i)I,  k  a  fixed  integer,  we  obtain  the  system  given 


below. 


ij(k)  =  (5q^)aj_j^*B(k)  +  (Bq)^aj(k-I) 


C5.1) 


where 


I  <  k  <  jl, 

-  (6q^)aj_]^*B(k) 


(5.2) 


where 


0  <  k  <  I, 


(5.3) 


where 


jl  <  k  <  (j+l)l,  and 
=  (Bq’^)aj(jl) 


where 


k  =  (j+i)i. 


From  Eq.  5.1,  we  have 
jl  ^ 


.l*B(k) 

_k 


(5.4) 


(eq)^  2 


E.  a,_i(k-I) 


=  X  +  y 


(5.5) 
Adding zero on the RHS of Eq. 5.5 and changing indices in the term Y,
we have


X  +  Y  =  X  +  ^  ^ 

s=0 


where 

R  =  (<SqI)  2 
k=0 


a^_]^*B(k) 


Grouping  one  R  with  X,  using  Eq.  5.2  to  transform  the  second  R,  recalling 
that  for  j-1,  0  <  k  <  jl,  and  using  the  definition  and  convolutional 
property  of  the  z-transform  [6.3,  6.12],  we  have 


RHS  (5. 5)  =  CX+R)  +  Y-R 

=  (6q^)aj_j^(z)B(z)  +  Y 


Eaj  (R) 


(5.6) 


Again,  using  the  definition  of  the  z-transform,  noting  that  in  Y  the 
sum  is  from  0  to  (j-1) I  while  on  the  LHS(5.5)  the  sum  varies  from  I  to 
jl,  and  adding  the  last  term  of  Eq.  5.6  to  the  LHS  of  Eq.  5.5,  we  obtain 


s 


z® 


(LHS(5.5)  +  R  +  Sj)  -  Sj , 


where  s  varies  from  jI+1  to  (j+l)I, 


=  (X+R)  +  (Y+S'.  ,)  -  S'  , 
J-1  J-1 


=  (X4-R)  + 


(5.7) 
where


s  varying  from  (j-l)!  +  1  to  jl. 

From  Eqs.  5.3  and  5.4,  we  find  that 


Thus,  the  Sj  term  cancels  out  in  Eq.  5.7  leaving  us  with  the  final  equation 
aj(z)  =  (6qI)aj_i(z)B(z)  +  (B4) 

Eq.  B4  can  now  be  solved  iteratively,  if  desired,  thereby  proving 
Theorem  5. 

W*  can  also  be  explicitly  defined  in  the  same  way  as  W  (except  that 
R*  can  vary  from  zero  to  infinity) .  The  importance  of  Theorem  5  is  its 
use  in  Proposition  4. 

Proposition 3. Let sc = 1 and uls* = 2. Then, we have for the
renewal time, T*, for state 2,

$$E[T^{*}] = \frac{(1-q^I)\mu'_1 + \mu'_2}{1-q^I} \qquad (B5)$$

Proof.

$$E[T^{*}] = \sum_{k=0}^{\infty} k\,P[T^{*}=k]
          = \sum_{k} k\,Q^{*}_{21} * Q^{*}_{12}(k)
          = -zD_z\bigl(Q^{*}_{21}Q^{*}_{12}\bigr), \quad\text{at } z = 1.$$

Evaluating the last expression, we have the result.

Proposition 4. Averaging W* over one renewal of state uls*, we have

$$E[W^{*}] = \frac{\beta(1-\beta^{I})}{\delta(1-q^I)} \qquad (B6)$$

Proof. The renewal time is given by T* in Proposition 3 and has pdf

$Q^{*}_{21} * Q^{*}_{12}$.

Summing  aj (z) ,  in  Theorem  5,  from  one  to  infinity,  we  have  from 
Eq.  B4 

A(z)-aQ(z)  =  (6qI)A(z)g(z)  +(^J-  A(z)  (B7) 


where  oo 

A(z)  =  ^ 

3=0 


zJ 


From the mean value property of the z-transform and the definition
of $a_j(k)$, we thus have

$E[W^{*}] = -zD_zA(z)$ (at z = 1).

The  proof  is  finished  by  evaluating  the  RHS  of  this  last  equation 
and  simplifying. 

We are now ready to prove the analogue of Theorem 3 (where the IFI
functional is considered to be a quantity dependent on the plan but
evaluated on the model) in

Theorem 6. For SMC(2),

$$\mathrm{IFI}(\infty;2) = v\beta(1-\beta^{I})P_2(\infty;2) \quad [\text{a.e.}]$$

where 2 = uls* and $P_2(\infty;2)$ is given in Theorem 4.

Proof. Again by the Strong Renewal (or Ergodic) Theorem,

$$\lim_{N\to\infty}\frac{W^{*}(N)}{N} = \frac{E[W^{*}]}{E[T^{*}]}
  = \frac{\beta(1-\beta^{I})}{\delta(1-q^I)}\cdot\frac{1-q^I}{(1-q^I)\mu'_1 + \mu'_2},$$

from Eqs. B5 and B6,

$$= \beta(1-\beta^{I})\,\frac{\mu'_2}{(1-q^I)\mu'_1 + \mu'_2}$$

(using $\mu'_2 = 1/\delta$)

$$= \beta(1-\beta^{I})P_2(\infty;2),$$

from Eq. B3.

Multiplying by v finishes the proof.

In  particular,  the  equations  in  Theorems  3  and  6  agree,  as  they 
should. 

Corollary.

$$\mathrm{AIFI}(\infty;2) = v\beta(1-\beta^{I})P_2(\infty;2)$$

Proof. The same as in the Corollary to Theorem 3.


2.4 TFI(∞;2) and Comparisons. Given the real number p varying over the
open unit interval, the inequality "$1-q^I < 1$", Theorem 1, and Theorem 4
imply

$$P_2(\infty;1) < P_2(\infty;2)$$

for SMC(1) and SMC(2). We shall show a similar result for AFI(∞;1) and
ATFI(∞;2). Before doing this, we record the following result

$$\mathrm{ATFI}(\infty;2) = \mathrm{AFI}(\infty;2) + \mathrm{AIFI}(\infty;2)$$

$$= \bigl(1 - vP_2(\infty;2)\bigr) + v\beta(1-\beta^{I})P_2(\infty;2)$$

$$= 1 - vP_2(\infty;2)\,(\delta + \beta^{I+1}) \qquad (B8)$$


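Theorem 7. For p in the open interval 0 < p < 1,

$$\mathrm{ATFI}(\infty;2) > \mathrm{AFI}(\infty;1) \quad\text{iff}\quad \beta(1-\beta^{I}) > q^I\alpha'_1$$

where $\alpha'_1 = \mu'_1/(\mu'_1 + \mu'_2)$.

Proof. From Eqs. A2 and B8, the statement is equivalent to

$$P_2(\infty;1) > P_2(\infty;2)\,(\delta + \beta^{I+1}).$$

This inequality is, in turn, equivalent to

$$(\delta + \beta^{I+1})(\mu'_1 + \mu'_2) < (1-q^I)\mu'_1 + \mu'_2 = (\mu'_1 + \mu'_2) - q^I\mu'_1.$$

Dividing through by $(\mu'_1 + \mu'_2)$, we have

$$(\delta + \beta^{I+1}) < 1 - q^I\alpha'_1$$

or

$$1 - (\delta + \beta^{I+1}) > q^I\alpha'_1.$$

However,

$$1 - (\delta + \beta^{I+1}) = \beta(1-\beta^{I}).$$

Thus,

$$\beta(1-\beta^{I}) > q^I\alpha'_1$$

which finishes the proof.

For p = 0 or 1, the formulas in Theorem 7 are equal.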
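
The equivalence in Theorem 7 can be spot-checked numerically. The sketch below is only an illustration: it treats $\mu'_1$, $\mu'_2$, $q^I$, $\beta$ and $v$ as free positive inputs (the algebra of the proof does not depend on how they are tied to p), takes AFI(∞;1) = 1 - vP2(∞;1) with P2(∞;1) = $\mu'_2/(\mu'_1+\mu'_2)$ as implied by the proof, and uses Eq. B8 for ATFI(∞;2). All function and parameter names are hypothetical.

```python
def p2_plan1(mu1p: float, mu2p: float) -> float:
    """P2(inf;1) = mu'_2 / (mu'_1 + mu'_2), as implied by the proof of Theorem 7."""
    return mu2p / (mu1p + mu2p)


def p2_plan2(mu1p: float, mu2p: float, qI: float) -> float:
    """P2(inf;2) = mu'_2 / ((1 - q^I) mu'_1 + mu'_2), Eq. B3."""
    return mu2p / ((1.0 - qI) * mu1p + mu2p)


def afi1(v: float, mu1p: float, mu2p: float) -> float:
    """AFI(inf;1) = 1 - v * P2(inf;1)."""
    return 1.0 - v * p2_plan1(mu1p, mu2p)


def atfi2(v: float, beta: float, I: int, mu1p: float, mu2p: float, qI: float) -> float:
    """ATFI(inf;2) = 1 - v * P2(inf;2) * (delta + beta^(I+1)), Eq. B8."""
    delta = 1.0 - beta
    return 1.0 - v * p2_plan2(mu1p, mu2p, qI) * (delta + beta ** (I + 1))


def theorem7_condition(beta: float, I: int, mu1p: float, mu2p: float, qI: float) -> bool:
    """Right-hand condition of Theorem 7: beta (1 - beta^I) > q^I alpha'_1."""
    alpha1p = mu1p / (mu1p + mu2p)
    return beta * (1.0 - beta ** I) > qI * alpha1p


if __name__ == "__main__":
    # Illustrative values only.
    v, beta, I, mu1p, mu2p, qI = 0.9, 0.6, 4, 12.0, 5.0, 0.3
    print(atfi2(v, beta, I, mu1p, mu2p, qI) > afi1(v, mu1p, mu2p),
          theorem7_condition(beta, I, mu1p, mu2p, qI))
```

For any admissible inputs the two printed booleans should coincide, which is exactly the "iff" of the theorem.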

Another type of CSP, denoted here by CSP-14, is the plan obtained from
replacement of DSI in CSP-11 by USI. For CSP-14, the SMC model is
straightforward since the limited inspection scheme runs with the natural flow of
operational time. For this model, we have

Proposition 5. Letting sc = 1, uls = 2, and ck (or USI) = 3,

$$P_2(\infty;4) = \frac{\mu_2}{(1-q^I)\mu_1 + \mu_2 + I}.$$

Proof. If e is the stationary vector, using the SMC model for the
ck phase found in [6.2], we have

$$(3-q^I)\,e = (1-q^I,\ 1,\ 1).$$

The rest of the proof easily follows from A.25 given that $\mu_3 = I$.

It clearly follows from Proposition 5 that

$$\mathrm{AFI}(\infty;4) = 1 - vP_2(\infty;4).$$

Thus, to compare AFI(∞;4) and ATFI(∞;2), it would suffice to compare the
expressions which are analogous to those in Theorem 7. However, to avoid
a long proof, it also suffices to give the following probabilistic argument.

Upon finding a defect in the sampling phase, I new units are inspected
with CSP-14 while, on the other hand, at most vI new units are inspected
under CSP-12. Since the transitional probabilities are the same from the
limited inspection (pseudo) phase in both plans, the proof is finished.


3.0 DSI - TRANSIENT. Two interpretations of DSI for the transient case
are treated in this chapter. The first version is the transient case of
DSI, already dealt with in Chapter 2 for infinite N. That is, DSI is
applied to both phases of CSP-11 with constant "pseudophase" transitional
probabilities. In contrast to the first version, the second plan's
"pseudophase" transitional probabilities to sc (or uls) monotonically increase
(or decrease) with increasing duration in the sampling phase, until
truncated by $1-q^I$ (or $q^I$). One can infer from this monotonicity that DSI
is applied only to the sampling phase in the following sense. If a defect
is found during a sampling segment, k + 1 time units from entrance to this
particular segment, then only the previous t units are to be inspected,
where t = min(k, I). Upon completion of this modified DSI, uls is entered
if no defects are found (with probability $q^t$); otherwise, sc is entered
(with probability $1-q^t$).

3.1  Introduction.  The  analysis  of  each  version  involves  three  stages. 
However,  for  convenience  in  the  final  section,  a  fourth  stage  is  added  for 
the  second  version. 
In the primary stage, the modified sampling phase is partitioned
into I+2 SMC states which are consecutively labelled 0 through I
and b. The purpose of this splitting is the derivation of an expression
for the monotonically increasing portion of the functional W(·).

In the secondary and tertiary stages, SMC states 1 through I are
recombined into a preliminary macrostate, c'; it, in turn, is combined with
SMC state 0 to form the final SMC state, c. The purpose of these latter
two manipulations is to facilitate the derivation of an expression for
the truncated portion of W(·) by avoiding complex sums of products of
characteristic functions.

The chapter concludes with a comparison between the TFI functional
of each version for infinite N (or t).

3.2 Strict DSI. In order to analyze the transient case of DSI, the SMC
model, shown in Figure 4, is used. It is denoted by SMC(3).

Figure 4. Model for CSP-12 (SMC(3)). [State diagram not reproduced.]
Legend: sc = a; $\bar\delta = \delta(1-q^I)$, $r = 1-\delta q^I$, $\delta' = \bar\delta/r$, and $\beta' = \beta/r$.


Concerning  this  model,  we  have 

Theorem  8.  SMC (3)  is  an  irreducible  SMC. 




Proof. The z-transformed pdf's of the states making up SMC(3),
together with their corresponding transitional probabilities in the
embedded MC, are given below.

$Q_{k,k+1}(z) = \beta/z$, $q_{k,k+1} = \beta$, for $1 \le k \le I-1$

$Q_{ka}(z) = \bar\delta/z$, $q_{ka} = \bar\delta$, for $1 \le k \le I$

$Q_{k0}(z) = \delta q^I/z$, $q_{k0} = \delta q^I$, for $1 \le k \le I$

$Q_{0a}(z) = \bar\delta/(z-\delta q^I)$, $q_{0a} = \delta'$

$Q_{01}(z) = \beta/(z-\delta q^I)$, $q_{01} = \beta'$

$Q_{a0}(z) = q^I(z-q)/\phi(z)$, $q_{a0} = 1$

$Q_{ba}(z) = \bar\delta/(z-\beta)$, $q_{ba} = 1-q^I$

$Q_{b0}(z) = \delta q^I/(z-\beta)$, $q_{b0} = q^I$

The equations follow from SMC(2) by observing that uls*, since its holding
time pdf is geometric, can be regarded as a MC state which jumps to itself
and sc with probabilities $\bar\beta$ and $\bar\delta$, respectively.

Ordering the states of SMC(3) in the same manner, from left to right,
as they are ordered in Figure 4, we obtain the linear system of equations
from the matrix equation e = eT, T the embedded MC transitional matrix.

$$\delta' e_0 + \bar\delta\sum_{j=1}^{I} e_j + (1-q^I)e_b = e_a \qquad (8.1)$$

$$e_a + \delta q^I\sum_{j=1}^{I} e_j + q^I e_b = e_0 \qquad (8.2)$$

$$\beta' e_0 = e_1,$$

$$\beta e_k = e_{k+1} \quad\text{for } 1 \le k \le I-1,$$

and

$$\beta e_I = e_b.$$

From this system, exclusive of Eqs. 8.1 and 8.2, we obtain

$$e_k = \beta^{k-1}\beta' e_0, \quad 1 \le k \le I \qquad (8.3)$$

and

$$e_b = \beta^{I}\beta' e_0. \qquad (8.4)$$

Eqs. 8.1, 8.3, and 8.4 imply

$$\delta' e_0 + \beta'(1-q^I)(1-\beta^{I})e_0 + (1-q^I)\beta^{I}\beta' e_0 = e_a$$

or

$$e_a = \frac{1-q^I}{1-\delta q^I}\,e_0. \qquad (8.5)$$

Since e is normalized, we have from the sum of its components, Eqs. 8.3,
8.4, and 8.5

$$e_0 = \frac{\delta(1-\delta q^I)}{G} \qquad (8.6)$$

where

$$G = (1+\delta)(1-\delta q^I) - \beta^{I+2}.$$

Thus, Eqs. 8.3, 8.4, 8.5, and 8.6 imply

$$e_a = \frac{\delta(1-q^I)}{G}, \quad e_0 = \frac{\delta(1-\delta q^I)}{G}, \quad e_b = \frac{\delta\beta^{I+1}}{G}, \quad e_k = \frac{\delta\beta^{k}}{G}$$

where $1 \le k \le I$.

Differentiating the Q's, multiplying through by minus one, and
evaluating the results at z = 1, we have (adding terms where appropriate)

$$\mu_a = \frac{1-q^I}{pq^I}, \quad \mu_0 = \frac{1}{1-\delta q^I}, \quad \mu_b = \frac{1}{\delta}, \quad\text{and}\quad \mu_k = 1$$

for $1 \le k \le I$.

This finishes Theorem 8.

Corollary. For SMC(3),

$$\sum_{j=1}^{I}\alpha_j + \alpha_0 + \alpha_b = P_2(\infty;2) \qquad (8.7)$$

where the LHS refers to SMC(3) and the RHS refers to SMC(2).
Proof.

$$\mu_0 e_0 + \sum_{k=1}^{I}\mu_k e_k + \mu_b e_b = (\delta+\beta)/G = 1/G,$$

$$\mu_a e_a + 1/G = \frac{1}{G\,P_2(\infty;2)}.$$

Thus

$$\text{LHS}(8.7) = G\,P_2(\infty;2)\,(1/G) = P_2(\infty;2).$$
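
The stationary vector of the embedded chain of SMC(3) can be checked numerically. The following sketch builds the transition matrix from the probabilities listed in the proof of Theorem 8 (state order a, 0, 1, ..., I, b, with $\bar\delta = \delta(1-q^I)$, $r = 1-\delta q^I$, $\delta = 1-\beta$), iterates e ← (e + eT)/2 to a fixed point, and compares the result with the closed forms $e_a = \delta(1-q^I)/G$, $e_0 = \delta(1-\delta q^I)/G$, $e_k = \delta\beta^k/G$, $e_b = \delta\beta^{I+1}/G$, $G = (1+\delta)(1-\delta q^I)-\beta^{I+2}$. These closed forms are a reconstruction from a damaged source and should be read as an assumption; the function names and the averaging iteration are illustrative choices.

```python
def embedded_matrix(beta: float, q: float, I: int) -> list:
    """Embedded-MC transition matrix of SMC(3); state order a, 0, 1, ..., I, b."""
    delta = 1.0 - beta
    qI = q ** I
    dbar = delta * (1.0 - qI)
    r = 1.0 - delta * qI
    n = I + 3
    idx_a, idx_0, idx_b = 0, 1, I + 2
    T = [[0.0] * n for _ in range(n)]
    T[idx_a][idx_0] = 1.0              # a -> 0 with probability 1
    T[idx_0][idx_a] = dbar / r         # 0 -> a with probability delta'
    T[idx_0][2] = beta / r             # 0 -> 1 with probability beta'
    for k in range(1, I + 1):
        row = 1 + k                    # matrix index of state k
        T[row][idx_a] = dbar           # k -> a
        T[row][idx_0] = delta * qI     # k -> 0
        T[row][row + 1] = beta         # k -> k+1 (or I -> b when k == I)
    T[idx_b][idx_a] = 1.0 - qI         # b -> a
    T[idx_b][idx_0] = qI               # b -> 0
    return T


def stationary(T: list, iterations: int = 20000) -> list:
    """Fixed point of e = eT via the averaged iteration e <- (e + eT) / 2."""
    n = len(T)
    e = [1.0 / n] * n
    for _ in range(iterations):
        step = [sum(e[i] * T[i][j] for i in range(n)) for j in range(n)]
        e = [0.5 * (x + y) for x, y in zip(e, step)]
    return e


def closed_form(beta: float, q: float, I: int) -> list:
    """Reconstructed closed-form stationary vector (assumption, see lead-in)."""
    delta = 1.0 - beta
    qI = q ** I
    G = (1.0 + delta) * (1.0 - delta * qI) - beta ** (I + 2)
    e_a = delta * (1.0 - qI) / G
    e_0 = delta * (1.0 - delta * qI) / G
    e_k = [delta * beta ** k / G for k in range(1, I + 1)]
    e_b = delta * beta ** (I + 1) / G
    return [e_a, e_0, *e_k, e_b]


if __name__ == "__main__":
    beta, q, I = 0.5, 0.5, 2           # illustrative values only
    print(stationary(embedded_matrix(beta, q, I)))
    print(closed_form(beta, q, I))
```

The averaged iteration is used instead of plain power iteration only to sidestep any concern about periodicity; it has the same fixed point.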

Relative to SMC(3), we have

Definition 5. The monotonically increasing portion of W(t), divided
by t, and considered as being defined on SMC(3) is

$$\frac{W'(t)}{t} = \frac{1}{t}\sum_{n=0}^{t-1}\sum_{k=1}^{I} k\,C_k(n)\bigl(1 - C_{k+1}(n+1)\bigr).$$

Thus we can also write

$$\mathrm{IFI}'(t;2) = v\,\frac{W'(t)}{t}.$$

Operating on this equation and the RHS of the equation in Definition 5
with $E_{sc}[\cdot]$, we obtain

$$\mathrm{AIFI}'(t;2) = \frac{v\delta}{t}\sum_{n=0}^{t-1}\sum_{k=1}^{I} k\,P_{ak}(n) \qquad (C1)$$

which can be evaluated by using the z-transformed Backward Equations for
SMC(3); see [6.1] for an example of such an evaluation. Letting t approach
infinity, we have

$$\mathrm{AIFI}'(\infty;2) = v\delta\sum_{k=1}^{I} k\,\alpha_k. \qquad (C2)$$

Since, from the last part of Theorem 8 and from A.25,

$$\alpha_k = \mu_k e_k\,\bigl(G\,P_2(\infty;2)\bigr) = \delta\beta^{k}P_2(\infty;2)$$

and

$$\alpha_b = (\mu_b e_b)\bigl(G\,P_2(\infty;2)\bigr) = \beta^{I+1}P_2(\infty;2),$$

we have, from Eq. C2,

$$\delta\sum_{k} k\,\alpha_k = \beta\delta^{2}P_2(\infty;2)\,D_\beta\sum_{k=1}^{I}\beta^{k}
  = \beta(1-\beta^{I}-\delta I\beta^{I})P_2(\infty;2).$$

From A.27,

$$W''(b)\,\alpha_b = \delta I\alpha_b$$

where W''(t) is the constant part of W(t). Therefore, adding the last
two expressions and performing the indicated operations, we have

$$\lim_{t\to\infty}\frac{E[W(t)]}{t} = \beta(1-\beta^{I})P_2(\infty;2),$$

a result which agrees with that obtained in Chapter 2.

In  order  to  deal  with  the  constant  part  of  the  functional  for 
finite  t,  we  proceed  to  reduce  SMC (3)  to  a  more  manageable  model  as 
described  in  Section  3.1. 

Stage  two  consists  in  filtering  out  the  states  1  through  I  in  SMC(3), 
an  operation  which  leads  to  a  new  model:  SMC (4).  The  details  and  results 
of  collapsing  SMC (3)  into  SMC (4)  are  given  in 

Theorem  9.  SMC (4)  is  an  irreducible  SMC  obtained  from  SMC (3). 

Proof. Let c' be the ordered ensemble composed of the states 1
through I. Noting that the pev of c' is (1,0,0,...,0), I-1 zeros, we
apply combinatorial analysis to get (dropping the argument z)

$$Q_{c'0} = Q_{10} + Q_{12}Q_{20} + \cdots + Q_{12}Q_{23}\cdots Q_{I0}
  = \frac{\delta q^I\bigl(1-(\beta/z)^{I}\bigr)}{z-\beta}. \qquad (9.1)$$

In the same way, we also obtain

$$Q_{c'a} = \frac{\bar\delta\bigl(1-(\beta/z)^{I}\bigr)}{z-\beta} \qquad (9.2)$$

$$Q_{c'b} = (\beta/z)^{I}. \qquad (9.3)$$

The  remaining  results  concerning  SMC (4)  can  be  easily  derived 
from  the  above  equations.  In  particular,  see  A. 29. 

Corollary. 

SMC (4)  <  SMC (3) 

where  "<"  is  the  filtration  ordering  relation. 

Proof.  SMC(4)  is  a  filtration  of  SMC(3)  by  the  proof  of  Theorem  9 
and  A. 29. 

Stage  three  consists  in  filtering  out  state  c'  in  SMC  (4)  yielding 
SMC (5).  The  details  and  results  are  given  in 

Theorem  10.  Filtering  out  c'  in  SMC (4)  yields  a  new  SMC,  denoted  by 
SMC(5). 
Proof. Let the ordered ensemble (0,c') be denoted by c. Then,
the pev for c is the vector (1,0).

First construction. Applying combinatorial analysis to the
transformed pdf's in Theorem 9 (Eqs. 9.1, 9.2, and 9.3), we have

$$Q_{ca} = \sum_{j=0}^{\infty}\bigl(Q_{0c'}Q_{c'0}\bigr)^{j}\bigl(Q_{0a} + Q_{0c'}Q_{c'a}\bigr)
  = \frac{Q_{0a} + Q_{0c'}Q_{c'a}}{1 - Q_{0c'}Q_{c'0}}
  = \frac{\bar\delta\,(z^{I+1}-\beta^{I+1})}{c(z)} \qquad (C3)$$

where

$$c(z) = z^{I+1}\bigl(z-(\beta+\delta q^I)\bigr) + \delta\beta(q\beta)^{I}.$$

Similarly,

$$Q_{cb} = \frac{Q_{0c'}Q_{c'b}}{1 - Q_{0c'}Q_{c'0}} = \frac{\beta^{I+1}(z-\beta)}{c(z)}. \qquad (C4)$$

Second construction. Since SMC(5) is the model to be used in
deriving an expression for the constant part of IFI(t;2), t finite, we
will sketch the more elaborate SMC method. The relevant absorbing SMC
has transient states 0 and c'; absorbing states a and b. Using A.21,
setting a = A, b = B, and c' = 1, we obtain the following transformed
Backward Equations (four others, not needed, are omitted).

$$P_{0A} = Q_{01}P_{1A} + Q_{0A}P_{AA}$$
$$P_{1A} = Q_{10}P_{0A} + Q_{1A}P_{AA}$$
$$P_{AA} = H_0$$

$$P_{0B} = Q_{01}P_{1B}$$
$$P_{1B} = Q_{10}P_{0B} + Q_{1B}P_{BB}$$
$$P_{BB} = H_0$$

Solving for $P_{0A}$ in the first set of three,

$$P_{0A} = \frac{H_0\,(Q_{0A} + Q_{01}Q_{1A})}{1 - Q_{01}Q_{10}}.$$

Since the pev of the ordered ensemble (0,1) is (1,0), the above equation,
Eq. A1, A.13, and A.22 imply

$$Q_{ca} = F_{0A} = P_{0A}/H_0 = \text{Eq. C3}.$$

Solving for $P_{0B}$ in the second set of three,

$$P_{0B} = \frac{H_0\,(Q_{01}Q_{1B})}{1 - Q_{01}Q_{10}}.$$

Again, since the pev = (1,0), the above equation together with Eq. A1,
A.13 and A.22 imply

$$Q_{cb} = F_{0B} = P_{0B}/H_0 = \text{Eq. C4}.$$

SMC (5)  has  three  states:  a,  c,  and  b.  The  transformed  pdf’s  for 
transitions  of  a  to  c,  b  to  c,  and  b  to  a  are  the  same  as  those  for  a  to  0, 
b  to  0,  and  b  to  a,  respectively,  in  SMC(4). 

We  finish  the  proof  of  Theorem  10  by  remarking  that  states  a  and  c 
cannot  be  combined  since  a  pev  (from  state  b)  does  not  exist. 

Corollary. 

SMC (5)  <  SMC  (4). 

Proof.  Construction  of  the  state  c  in  SMC (5)  is  equivalent  to 
filtering  out  state  c’  in  SMC (4).  SMC  (5)  is  an  irreducible  SMC  by  A. 29. 

We can now derive an expression for the constant part of IFI(t;2) in

Theorem 11. Given the 3 state model, SMC(5),

$$\mathrm{IFI}''(t;2) = vI\,\frac{N_b(t) - C_b(t)}{t} \qquad (C5)$$

Proof. $N_b(t)$ gives the number of entrances to state b by time t.
The number of exits from state b is clearly $N_b(t) - C_b(t)$, the second term
being the characteristic function of state b.

Corollary.

$$\mathrm{AIFI}''(t;2) = vI\left(\frac{E_a[N_b(t)]}{t} - \frac{P_{ab}(t)}{t}\right) \qquad (C6)$$

Proof. Apply $E_a[\cdot]$ to Eq. C5.

In order to use Eq. C6, we must be able to develop a usable
expression for the mean of the renewal function. Towards this end,
we prove

Proposition 6. Let N(t) be a renewal process. Then

$$E[N(t)] = H_0 * F * \Bigl(\sum_{j=0}^{\infty} F^{(j)}\Bigr)(t)$$

where F is the renewal pdf.

Proof.

$$P[N(t) = n] = P[U(n+1) > t] - P[U(n) > t]
  = H_0 * F^{(n)}(t) - H_0 * F^{(n+1)}(t) = P_n(t).$$

Thus,

$$P_n(z) = H_0\,(F)^{n}(1-F).$$

Therefore,

$$\sum_{n=0}^{\infty} s^{n}P_n(z) = \frac{H_0\,(1-F)}{1-sF} = P(z,s).$$

From the last function, we have

$$\frac{\partial}{\partial s}P(z,s)\ (\text{at } s = 1) = \frac{H_0\,F}{1-F}.$$

The LHS is the transform of the mean and we are done.
Corollary 1.

$$\frac{E_a[N_b(t)]}{t} = \frac{H_0 * F_{ab} * (1-F_{bb})^{-1}(t)}{t}$$

where the inverse expression is shorthand for the summation.

Proof. Renewals of state b, starting in state a, form a delayed
renewal process with initial probability function $F_{ab}$. Then Proposition 6
finishes the proof.

Corollary 2.

$$\lim_{t\to\infty}\frac{E_a[N_b(t)]}{t} = \delta\alpha_b.$$

Proof. From Corollary 1 above, we have

$$\lim_{t\to\infty}\frac{E_a[N_b(t)]}{t} = \lim_{t\to\infty}\frac{H_0 * S(t)}{t} = \lim_{t\to\infty} S(t),$$

where $S(t) = F_{ab} * (1-F_{bb})^{-1}(t)$,

$$= \lim_{z\to 1}\left(\frac{z-1}{z}\right)\frac{F_{ab}(z)}{1-F_{bb}(z)}
  = \frac{1}{-zD_zF_{bb}(z)}\ (\text{at } z = 1)
  = \delta\alpha_b.$$

The second equality follows from the simple argument that if S(·)
is a sequence with limit A, then the Cesaro limit of S(·) also exists
and is equal to A.
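
The limit in Corollary 2 is an instance of the elementary renewal theorem, E[N(t)]/t → 1/(mean renewal time). The sketch below illustrates this for an ordinary (non-delayed) discrete renewal process by solving the renewal equation m(t) = F(t) + Σ f(k) m(t-k) directly. It assumes renewals are counted on the time points 1, ..., t and that the interarrival pmf is supplied as a list with f[0] = 0; these conventions, and all names, are illustrative rather than taken from the paper.

```python
def renewal_mean(f: list, t_max: int) -> list:
    """Return m[0..t_max], where m[t] = E[N(t)] for interarrival pmf f (f[0] must be 0).

    Uses the discrete renewal equation m(t) = F(t) + sum_{k=1..t} f(k) * m(t - k).
    """
    m = [0.0] * (t_max + 1)
    F = 0.0                                  # running cdf F(t)
    for t in range(1, t_max + 1):
        if t < len(f):
            F += f[t]
        upper = min(t, len(f) - 1)
        m[t] = F + sum(f[k] * m[t - k] for k in range(1, upper + 1))
    return m


if __name__ == "__main__":
    # Geometric interarrival times with exit probability delta: mean 1/delta,
    # so E[N(t)]/t should equal delta (here the relation is exact for every t).
    delta, beta = 0.25, 0.75
    f_geo = [0.0] + [delta * beta ** (k - 1) for k in range(1, 400)]
    m = renewal_mean(f_geo, 100)
    print(m[100] / 100, delta)

    # A small arbitrary pmf with mean 2.1: the ratio approaches 1/2.1.
    f_small = [0.0, 0.2, 0.5, 0.3]
    m = renewal_mean(f_small, 2000)
    print(m[2000] / 2000, 1 / 2.1)
```

With geometric holding times of mean 1/δ, as for state b, the rate of renewals per unit of time spent cycling through b alone is δ; multiplying by the long run fraction of time $\alpha_b$ gives the $\delta\alpha_b$ of Corollary 2.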

From the second corollary to Proposition 6, we have in addition

$$\frac{E_a[N_b(t)]}{t} - \frac{P_{ab}(t)}{t} \to \delta\alpha_b$$

as t approaches infinity, since the second term goes to zero.

The main results about IFI(t;2) are summed up in
Theorem 12. For the transient case of CSP-12, we have

$$\mathrm{AIFI}(t;2) = \mathrm{AIFI}'(t;2) + \mathrm{AIFI}''(t;2)$$

where the first and second terms on the RHS are evaluated using SMC(h),
h = 3 and 5, respectively.

Proof. Combine Eqs. C2 and C5 (taking the limit, we get v times
the result using W' and W'').

When t is finite, in order to compute $E_a[N_b(t)]$, we need to know
$F_{ab}(t)$ and $F_{bb}(t)$. Since SMC(5) has 3 states, we have 9 Backward
Equations, only one of which is needed for the mean value of the above
renewal function. The following statements sketch the results.

From Theorem 10, A2.1, and A1.4, we have

$$P_{bb} = Q_{bc}P_{cb} + Q_{ba}P_{ab} + H_b.$$

This equation is equivalent to

$$1 = Q_{ba}\frac{P_{ab}}{P_{bb}} + Q_{bc}\frac{P_{cb}}{P_{bb}} + \frac{H_b}{P_{bb}}$$

or

$$1 - \frac{H_b}{P_{bb}} = Q_{ba}F_{ab} + Q_{bc}F_{cb}.$$

But, LHS = $F_{bb}$. Therefore,

$$F_{bb} = Q_{ba}F_{ab} + Q_{bc}F_{cb}.$$

From Theorem 10,

$$Q_{bc} = \frac{\delta q^I}{z-\beta}, \qquad Q_{ba} = \frac{\delta(1-q^I)}{z-\beta}.$$

Applying combinatorial analysis to the transformed pdf's of SMC(5),
we have

$$F_{cb} = \sum_{j}\bigl(Q_{ca}Q_{ac}\bigr)^{j}Q_{cb} = \frac{Q_{cb}}{1 - Q_{ac}Q_{ca}}$$

and

$$F_{ab} = Q_{ac}F_{cb} = \frac{Q_{ac}Q_{cb}}{1 - Q_{ac}Q_{ca}}.$$

From these equations, $E[N_b(t)]/t$ can be computed [cf. 6.1].

The use of SMC(3) suggests the following alternative treatment of
CSP-12. Instead of splitting uls* into I+2 states, we split it into
an infinite number by splitting state b into the states b(j), $1 \le j < \infty$.

The resulting model, SMC(6), consists of two nontrivial SMC states
(a and 0) and an infinite number of trivial SMC (i.e., MC) states (1 through I
and the b(j)'s). For the long run case, we can obtain AIFI(∞;2) via the
transient case as shown in

Proposition 7. SMC(6) is an infinite state, irreducible, and
positive recurrent SMC. The result for IFI(t;2) for SMC(6) is the same
as previous results.

Proof. For b(j), $1 \le j < \infty$, we have

$$\alpha_{b(j)} = \delta\beta^{I+j}P_2(\infty;2) \qquad (7.1)$$

and

$$\mu_{b(j)} = 1.$$

Thus $\mu_{b(j)b(j)} = \mu_{b(j)}/\alpha_{b(j)}$ is finite, proving the chain positive
recurrent.

For the functional, it suffices to deal with the part defined on
the b(j)'s, W'''.

$$\frac{W'''(t)}{t} = \frac{1}{t}\sum_{n=0}^{t-1}\sum_{j=1}^{\infty} C_{b(j)}(n)\bigl(1 - C_{b(j+1)}(n+1)\bigr)$$

Taking the mean value, conditioned by an initial entrance from
state a,

$$\frac{E_a[W'''(t)]}{t} = \frac{\delta}{t}\sum_{n}\sum_{j} P_{ab(j)}(n)$$

which, as t approaches infinity, approaches

$$\delta^{2}\beta^{I+1}P_2(\infty;2)\sum_{j=0}^{\infty}\beta^{j}$$

by Eq. 7.1 and the Lebesgue Dominated Convergence Theorem (for
sequences).

Proposition  8.  The  models  used  for  CSP-12  are  ordered,  w.o. 
filtration,  as  follows. 

SMC (2)  <  SMC (5)  <  SMC (4)  <  SMC (3) 


and 


SMC (5)  <  SMC (6) 

Proof.  Corollaries  to  Theorems  9  and  10  imply  the  first  ordering. 

By  filtering  out  states  b(j),  j  ^  2,  we  get  the  second  ordering. 

If  we  split  state  a  into  its  component  MC  states  and  state  0  into 
a  MC  state  in  SMC(6),  we  get  (S)MC(7)  >  SMC(5),  SMC(3).  If  we  instead 
split  a  and  0  as  before  but  now  split  b  by  treating  it  as  a  MC  state,  we 
get  (S)MC(8)  >  SMC(5),  SMC(3),  Clearly,  (S)MC(7)>(S)MC(8) .  MC(8)  can  be 

thought of as a finite state MC model which fills the role of the initial
MC model described in the introduction to Chapter 1, though the
construction is backwards from that description.
3.3 Liberal DSI. To obtain a more liberal DSI, we alter the
following transformed pdf's for states 0 through I-1 in

Theorem 13. The DSI sampling plan CSP-13 is obtained from the
SMC(3) model of CSP-12. The result, SMC(3), is an irreducible SMC.

Proof. The appropriate quantities and properties are given
below.

$Q_{0a} = q_{0a} = 0$

$Q_{01} = \beta/(z-\delta)$, $q_{01} = 1$

$Q_{ka} = \delta(1-q^{k})/z$, $q_{ka} = \delta(1-q^{k})$

$Q_{k0} = \delta q^{k}/z$, $q_{k0} = \delta q^{k}$

where $1 \le k \le I-1$.

The other transformed pdf's remain the same as those for SMC(3).

Ordering the states a, 0, 1, ..., I, and b, we obtain, from the
stationary vector equation, the system of equations now given.

$$\sum_{j=1}^{I}\delta(1-q^{j})e_j + (1-q^I)e_b = e_a \qquad (13.1)$$

$$e_a + \delta\sum_{j=1}^{I} q^{j}e_j + q^I e_b = e_0 \qquad (13.2)$$

$$\beta^{k-1}e_0 = e_k \qquad (13.3)$$

where $1 \le k \le I$, and

$$\beta e_I = e_b. \qquad (13.4)$$

From Eqs. 13.2, 13.3, and 13.4, we get

$$e_a = \frac{p\bigl(1-(\beta q)^{I}\bigr)}{1-\beta q}\,e_0. \qquad (13.5)$$

Since the components of e are normalized, we obtain, together with
Eq. 13.5,

$$e_0 = \frac{\delta(1-\beta q)}{G} \qquad (13.6)$$

where

$$G = \delta p\bigl(1-(\beta q)^{I}\bigr) + (1-\beta q)(1+\delta-\beta^{I+1}).$$

Eqs. 13.4, 13.5, and 13.6 imply

$$e_a = \frac{\delta p\bigl(1-(\beta q)^{I}\bigr)}{G},$$

$$e_k = \frac{\delta\beta^{k-1}(1-\beta q)}{G}, \quad 1 \le k \le I,$$

and

$$e_b = \frac{\delta\beta^{I}(1-\beta q)}{G}.$$

Similarly, from the derivatives of the transformed pdf's, we obtain

$$\mu_a = \frac{1-q^I}{pq^I}, \quad \mu_0 = \frac{1}{\beta}, \quad \mu_b = \frac{1}{\delta}, \quad \mu_k = 1$$

where $1 \le k \le I$.

3.4 Comparison of CSP-12 and CSP-13. In the equations to be derived
in this section, $P_2(\infty;3)$ is the long run percentage of time spent in
state b = 2 in the three stage reduction of SMC(3) to SMC(2) which is the
analogue of SMC(2) for CSP-13, directly obtained
from SMC(3) by filtering out the states 0 through I and b, again
yielding SMC(2). This latter filtration is equivalent to the SMC
method applied to the ordered ensemble (0, 1, ..., I, b), with
pev = (1, 0, ..., 0), I+1 zeros, to obtain the two state model
for CSP-13.

Given the stationary vector components and the state mean time
values, from Theorem 13, we get the α's for CSP-13:

$$\alpha_k = \delta\beta^{k}P_2(\infty;3), \quad 1 \le k \le I \qquad (13.6)$$

and

$$\alpha_b = \beta^{I+1}P_2(\infty;3) \qquad (13.7)$$

where

$$P_2(\infty;3) = \frac{\mu'_2}{\dfrac{p\beta\bigl(1-(\beta q)^{I}\bigr)}{1-\beta q}\,\mu'_1 + \mu'_2} \qquad (C7)$$

(1 = a and 2 = b).

Applying the Ergodic Theorem and Eqs. 13.6 and 13.7 to the
functional W(t), defined on SMC(2), yields

$$\lim_{t\to\infty}\frac{W(t)}{t} = \delta\sum_{k=1}^{I} k\,\alpha_k + \delta I\beta^{I+1}P_2(\infty;3)
  = \beta(1-\beta^{I})P_2(\infty;3).$$

Upon taking the limit, the definition of IFI(t;3), analogous to
Definition 4, gives

$$\mathrm{AIFI}(\infty;3) = v\beta(1-\beta^{I})P_2(\infty;3).$$

Adding AFI(∞;3) to the above leads to the final equation

$$\mathrm{ATFI}(\infty;3) = 1 - vP_2(\infty;3)\,(\delta+\beta^{I+1}) \qquad (C8)$$
With regard to the last equation, we have
Theorem 14. For p in the open unit interval,

$$\mathrm{ATFI}(\infty;3) < \mathrm{ATFI}(\infty;2).$$

Proof. The statement is equivalent to

$$P_2(\infty;3) > P_2(\infty;2)$$

which is implied by

$$\frac{p\beta\bigl(1-(\beta q)^{I}\bigr)}{1-\beta q} < 1-q^I.$$

Dividing both sides by p and using the theorem on geometric sums, the
above inequality is equivalent to

$$\beta\Bigl(1+\sum_{j=1}^{I-1}(\beta q)^{j}\Bigr) < \Bigl(1+\sum_{j=1}^{I-1}q^{j}\Bigr)$$

or

$$\beta[1+S_1] < [1+S_2].$$

But $\beta < 1$ and $\beta S_1 < S_2$, for p between zero and one. The cases for p = 0
and p = 1 lead trivially to the same formulas.
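
The inequality used in the proof of Theorem 14 can be checked over a grid of parameters. The following sketch assumes the forms $P_2(\infty;2) = \mu'_2/((1-q^I)\mu'_1+\mu'_2)$ and $P_2(\infty;3) = \mu'_2/(c\,\mu'_1+\mu'_2)$ with $c = p\beta(1-(\beta q)^I)/(1-\beta q)$, and ATFI = $1 - vP_2(\delta+\beta^{I+1})$ for both plans; $\mu'_1$, $\mu'_2$ and $v$ are treated as free positive inputs, and all names are illustrative.

```python
def coeff_strict(p: float, I: int) -> float:
    """Coefficient of mu'_1 for CSP-12: 1 - q^I."""
    return 1.0 - (1.0 - p) ** I


def coeff_liberal(p: float, beta: float, I: int) -> float:
    """Coefficient of mu'_1 for CSP-13: p*beta*(1 - (beta*q)^I) / (1 - beta*q)."""
    q = 1.0 - p
    return p * beta * (1.0 - (beta * q) ** I) / (1.0 - beta * q)


def p2(coeff: float, mu1p: float, mu2p: float) -> float:
    """Long run fraction of time in state 2 for a given mu'_1 coefficient."""
    return mu2p / (coeff * mu1p + mu2p)


def atfi(v: float, beta: float, I: int, p2_value: float) -> float:
    """ATFI = 1 - v * P2 * (delta + beta^(I+1)), the common form of Eqs. B8 and C8."""
    delta = 1.0 - beta
    return 1.0 - v * p2_value * (delta + beta ** (I + 1))


if __name__ == "__main__":
    # Illustrative values only.
    p, beta, I, v, mu1p, mu2p = 0.1, 0.8, 5, 0.9, 10.0, 4.0
    a2 = atfi(v, beta, I, p2(coeff_strict(p, I), mu1p, mu2p))
    a3 = atfi(v, beta, I, p2(coeff_liberal(p, beta, I), mu1p, mu2p))
    print(a3, a2, a3 < a2)
```

Because the liberal coefficient is smaller, $P_2(\infty;3)$ is larger and ATFI(∞;3) is smaller, which is the ordering asserted by the theorem.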


To handle the transient case of CSP-13, SMC(3) is used for the
increasing part of W. The constant part of W is handled in the same way
as the corresponding constant part of W is handled for CSP-12. That is,
SMC(3) is collapsed (or filtered) to SMC(4) which in turn is collapsed
to SMC(5). This analogous two stage process for CSP-13 is briefly
given in

Theorem 15. For CSP-13, filtration gives the following ordered
set of models:

SMC(5) < SMC(4) < SMC(3).

Proof. Combining states 1 through I, in SMC(3), into state c'
as is done with SMC(3), we have

$$Q_{c'b} = Q_{12}Q_{23}\cdots Q_{Ib} = (\beta/z)^{I}.$$

Similarly,

$$Q_{c'0} = Q_{10} + Q_{12}Q_{20} + \cdots + Q_{12}Q_{23}\cdots Q_{I0}.$$

Secondly, combining states 0 and c' into the new state c is
similarly accomplished and yields

$$Q_{ca} = \frac{Q_{0c'}Q_{c'a}}{1 - Q_{0c'}Q_{c'0}}.$$

The corresponding q's are given by

$$q_{cb} = \frac{(1-\beta q)\beta^{I}}{A}$$

and

$$q_{ca} = \frac{(1-\beta q)(1-\beta^{I}) - \delta q\bigl(1-(\beta q)^{I}\bigr)}{A}$$

where

$$A = p + \delta q(\beta q)^{I}.$$
Once  again,  the  constant  part  of  the  functional  W(t)  is  given  by 

,  INb(t)  Pab(t)  I 


4.0 DSI AND OTHER FUNCTIONALS

4.1 Introduction. The TFI functional makes a distinction between the
two plans treated in Chapter 3 in terms of the "pseudophase"
transitional probabilities. However, because of its very definition, TFI does
not explicitly take account of multiple inspections of a given production
unit. That is, TFI is defined in terms of an operational time which is
measured by a flow of successive and nonrepeating production units. In
this chapter, a new functional, along with a variation, is introduced to
augment TFI as a measure of plan performance. The functional is Fraction
of Repetitions (FR). It will be analyzed only for the first type of plan
(CSP-12). Furthermore, FR is chosen as the principal functional because
1.) it is naturally normalized and 2.) its long run moments can be
naturally derived from those of the transient case with a certain amount
of ease. Short run higher moments for its variant cannot be obtained so
readily; indeed, appeal must be made to the Strong Ergodic Theorem (or
Renewal Theorem) for even the long run (expected) value.
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4.2  SMC (9)  and  FR(N;2).  The  model  which  will  be  used,  SMC (9),  is  a 
modification  of  SMC (2)  and  is  portrayed  in  Figure  5, 

Figure 5. CSP-12 and SMC(9)


The  transitional  matrix  of  the  embedded  MC  is 


a  0  0’  b 


where  6  *  6(l-q^)  and  r  =  l-6q^. 

The matrix entries are obtained from the transformed pdf's given in

Theorem 16. SMC(9) is an irreducible SMC.

Proof. The transformed pdf's are
Qao  =  q^(z-q)/((>(z) 

Qoa  =  *5/z,  Qoo'  =  6ql/z,  and  Qob  =  g/z 
Qo'a=  S/Cz-dq^)  and  Qo’b  =  $/(z-6q^) 

Qba  “  <S/(z-B)  and  Q^,o»  =  6qI/(z-3). 


The  mean  holding  times,  obtained  from  the  derivatives  of  the 
transformed  pdf’s,  are 


Va  =  ^  ,  VO  =  1,  VO’  =  ,  and  Ub  =  6  • 


Using the matrix given after Figure 5 to solve the usual eigenvalue
equation for the stationary vector e yields the system of equations
given below.


6eo  +  J  eo’  + 


ea  - 

6q^eo  + 

Bcq  +  f  egt  =  % 


(where  r  =  l-6q^) . 


Solving  the  system  gives 


e 


0’ 


.  e„  . 

1-qI 


Again we use the fact that the components of the stationary vector add
to one. Using the equation which expresses this fact, together with the
last three, gives

©a  =  eo 

=  (l-q^)/G 
eg  I  =  q^(l-6q^)/G 
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and 


eij  =  B/G 

where 

G  =  (1-qI)  +  (l-6ql)  +3. 

We finish the proof by translating into English what the
transitions mean in SMC(9); we will write "state x goes to state y" as
"x to y". The transitions are: 0 to b if no defect is found; 0 to 0' if a
defect is found but DSI finds none; 0 to a if a defect is found and DSI
finds one or more; 0' to a if a defect is found and DSI finds one or more;
0' to b if the unit is either not inspected or is inspected and found
non-defective; and 0' to 0' (remaining in 0') if a defect is found but DSI
finds no defects. The transition 0' to 0' is "internal"; that is, 0' has
no self transitions and is consequently a nontrivial SMC state (see its
pdf above and Chapter 1, section 5).

We  are  now  ready  to  define  the  principal  functional  in 

Definition  6.  Given  the  model  SMC  (9)  for  CSP-12,  the  functional 
Fraction  of  Repetitions  is  ^ 


The definition of FR(t) is motivated by the comments made at the end of
the proof to Theorem 16. In addition, we remark that minus one appears
since the inspection process begins in state a and the summation appears
for 0' since self transitions are not allowed. For infinite t, FR has the
value given in

Theorem 17.

\[ \lim_{t\to\infty} FR(t) = \frac{1}{(1-q^i)\mu'_1 + \mu'_2} \quad [\text{a.e.}] \]

where $\mu'_1$ and $\mu'_2$ are defined in Theorem 1.
Proof.  From  Definition  6, 
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Lim  FR(t)  = 

t-xx> 


Lim 

t->oo 


+  oini ,  [a.e.  ] 


by  the  Strong  Ergodic  Theorem.  From  Theorem  16  and  A. 25,  we  have 


^a  _  ■ _ (l^q^) _ 

^a  ]i\  (l-q^)+(l-ql)+ql+  6/ 6 


(l-ql)+y^ 


and 


0’ 


y’l(l-ql)+  y’2 


Adding  the  two  expressions  finishes  the  proof. 

Since  (I)'(tFR(t))  can  be  regarded  as  the  degree  of  inspection 
overlap,  we  are  led  to  define  a  variant  of  FR(t)  in 

Definition  7. 


FR’(t) 


_ t _ 

I(tFR(t))  +  t  • 


Concerning  this  functional,  we  have 
Theorem 18.

\[ \lim_{t\to\infty} FR'(t) = \frac{(1-q^i)\mu'_1 + \mu'_2}{1 + (1-q^i)\mu'_1 + \mu'_2} \quad [\text{a.e.}] \]

Proof. From Definition 7,

Lim  FR’(t)  =  (I)(FR(t))+l 


by  Theorem  17, 

=  the  result. 


4.3 Expansions and Extensions. Another possible treatment of DSI is
the expansion of MRP (and SMC) models to "transition state" models. We
will work here only with MRP's.

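Given a MRP (Y, U) as in A.19, we can easily prove that

\[ P[T_n = t \mid Y_{n-1} = i \text{ and } Y_n = j] = Q_{ij}(t)/q_{ij} \tag{D1} \]

where $T_n = U_n - U_{n-1}$. From A.19 and Eq. D1, we can also easily show that

\[ \{((Y_n, Y_{n+1}), U_n) : n \text{ varies over the natural numbers}\} \tag{D2} \]

is a (derived) MRP whose pdf's are given by

\[ P[(Y_n, Y_{n+1}) = (j,k),\ T_n = t \mid (Y_{n-1}, Y_n) = (i,j)] = \frac{Q_{ij}(t)\, q_{jk}}{q_{ij}} \tag{D3} \]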


We name the MRP given by expression D2 and simplify notation in

Definition 8. The MRP given by D2 is called the Expanded MRP. Its
pdf's are given by Eq. D3 and denoted by $Q_{(ij)(jk)}(t)$.

Such a derived process can automatically keep track of transitions,
their number and type, in the parent process. Thus, FR(·;2) could be
defined (and evaluated) on expanded MRP(2) as given below.


Theorem  19.  Expanded  MRP (2)  is  a  MRP. 

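Proof. From Definition 8 and Theorem 2, the transformed pdf's
are (dropping the argument)

\[ Q_{(12)(22)} = q^i Q_{12}, \qquad Q_{(12)(21)} = (1-q^i) Q_{12}, \]

\[ Q_{(22)(22)} = Q_{22}, \qquad Q_{(22)(21)} = \frac{(1-q^i) Q_{22}}{q^i}, \]

\[ Q_{(21)(12)} = \frac{Q_{21}}{1-q^i}. \]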


Letting z = 1 in the above equations, we get the transitional matrix
of the embedded MC

\[ \begin{array}{c|ccc}
     & (12) & (22) & (21) \\ \hline
(12) & 0 & q^i & 1-q^i \\
(22) & 0 & q^i & 1-q^i \\
(21) & 1 & 0 & 0
\end{array} \]

Using the matrix to solve for the components of the stationary
vector gives

\[ e_{(12)} = (1-q^i)/G, \qquad e_{(22)} = q^i/G, \qquad e_{(21)} = e_{(12)}, \]

where

\[ G = 2 - q^i. \]
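The stationary vector above can be checked numerically. The following is a minimal sketch, not part of the original paper: the function names and the value chosen for q^i are illustrative assumptions, and the stationary vector is approximated by repeatedly applying the embedded-MC matrix to a starting distribution.

```python
def expanded_mrp2_matrix(qi):
    """Embedded-MC matrix of Expanded MRP(2); rows and columns are ordered
    (12), (22), (21). `qi` stands for q^i and is assumed to lie in (0, 1)."""
    return [
        [0.0, qi, 1.0 - qi],
        [0.0, qi, 1.0 - qi],
        [1.0, 0.0, 0.0],
    ]


def stationary_vector(P, iterations=2000):
    """Approximate the row vector e with e = eP by repeated multiplication."""
    n = len(P)
    e = [1.0 / n] * n
    for _ in range(iterations):
        e = [sum(e[i] * P[i][j] for i in range(n)) for j in range(n)]
    return e


qi = 0.6  # hypothetical value of q^i, chosen only for illustration
e = stationary_vector(expanded_mrp2_matrix(qi))

# Closed form quoted in the text: e(12) = e(21) = (1 - q^i)/G, e(22) = q^i/G.
G = 2.0 - qi
closed_form = [(1.0 - qi) / G, qi / G, (1.0 - qi) / G]
```

The iteration converges here because state (22) can return to itself, so the embedded chain is aperiodic whenever q^i is strictly between 0 and 1.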

Defining $\mu_{ij}$ as the mean holding time till transition to state j
from state i, using Definition 8, and using the mean value property
of the transformed pdf's, we get

\[ \mu_{(ij)} = \Big(\mu_{ij} \sum_k q_{jk}\Big) / q_{ij} = \mu_{ij}/q_{ij} \tag{19.1} \]


Applying Eq. 19.1 to the transformed pdf's yields

\[ \mu_{(12)} = \mu_{12}/q_{12} = \mu'_1/1 = \mu'_1, \]

\[ \mu_{(22)} = \mu_{22}/q_{22} = \mu_{22}/q^i = \mu'_2, \]

\[ \mu_{(21)} = \mu_{21}/q_{21} = \mu_{21}/(1-q^i) = \mu'_2, \]

where $\mu'_1$, $\mu'_2$, and the transitional probabilities are defined in
(or derived from) Theorem 1.

Definition  9.  For  Expanded  (MRP (2)), 


FE(t;2)  -  (I)  j?(a2)(t)  ( 


Theorem 20. For FR(t;2) in Definition 9,

\[ \lim_{t\to\infty} FR(t;2) = \frac{1}{(1-q^i)\mu'_1 + \mu'_2} = \frac{1}{E[T]} \quad [\text{a.e.}] \]

where E[T] is given in Proposition 1.

Proof. Theorem 19 and Definition 9.
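The limit in Theorem 20 can be recovered arithmetically from the stationary vector and the mean holding times of the expanded process. The sketch below rests on an assumption, since Definition 9 is not fully legible in this copy: it takes FR(t;2) to count, per unit time, the entries into the expanded states (12) and (22). The numerical values of q^i, mu'_1 and mu'_2 are hypothetical.

```python
def long_run_entry_rate(e, mu, counted):
    """Long-run number of entries per unit time into the states in `counted`
    for a semi-Markov process with stationary vector `e` (embedded chain)
    and mean holding times `mu`: sum of e_k over the counted states divided
    by the sum of e_k * mu_k over all states."""
    return sum(e[k] for k in counted) / sum(e[k] * mu[k] for k in e)


qi, mu1, mu2 = 0.6, 4.0, 2.5  # hypothetical q^i, mu'_1 and mu'_2
G = 2.0 - qi

# Stationary vector and mean holding times of Expanded MRP(2), as in the text.
e = {"12": (1.0 - qi) / G, "22": qi / G, "21": (1.0 - qi) / G}
mu = {"12": mu1, "22": mu2, "21": mu2}

# Assumed reading of Definition 9: count entries into (12) and (22).
rate = long_run_entry_rate(e, mu, counted=("12", "22"))
theorem_20_limit = 1.0 / ((1.0 - qi) * mu1 + mu2)
```

Under this reading the two quantities agree, because e(12) + e(22) = 1/G and the weighted holding times sum to ((1 - q^i) mu'_1 + mu'_2)/G.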

We close this chapter by showing that SMC(9) cannot be collapsed
into any of the other models for CSP-12. Any collapsing would require
that the ordered ensemble S = (0,0') be a macrostate as defined in
Chapter 1. However, entrance from state a or b would require the pev
to be (1,0) or (0,1), respectively. If we picked the former pev and
formally defined Q_bS to be the same as Q_b0', the Backward Equation system,
for SMC(9)', say, would not hold. For example, if S were a macrostate,
then, letting S = d, the equations

\[ P_{ab}(t) = Q_{ad} * P_{db}(t) \]

and

\[ P_{bb}(t) = Q_{bd} * P_{db}(t) + Q_{ba} * P_{ab}(t) + J_b(t) \]

would have to hold. However, entrance to d from state a results in a
greater probability for a given holding time in d than an entrance from
state b. Consequently, P_db(t) is not well defined.

Another way of stating this inconsistency is provided by

Definition 10. Let $P_{xy}(t;w)$ be the Fundamental Probability Function,
from x to y, given that entrance into x is from w.

Then consistency requires that $P_{xy}(t;w)$ be independent of state w.
However, for SMC(9)',

\[ P_{db}(t;b) \neq P_{db}(t;a). \]

Similar results are obtained if we pick (0,1) as the pev and define
Q_ad formally.

Under certain conditions, we can still reduce a MC to a SMC in the
case that the relevant probability functions are indexed by ensembles of
MC states as occurs in SMC(9)'. The dependence of the probability functions
on the entrance ensemble is equivalent to the dependency of the pev's. We
therefore drop the restriction of pev independence by using v(x;y)
to denote the pev of the ensemble x given an entrance from y. Furthermore,
since v_j(x;y) being zero, for a given MC state j, can imply that
j cannot be reached from any other states in x, x itself becomes a
function of y: x = x(y). Further dependence is handled by dropping
the inner parenthesis: for example, x(y(w)) = x(yw). Letting a, b, c,
d, ... be (disjoint) ensembles of MC states which we wish to transform
into macrostates, we make a provisional definition for the holding time
pdf's in

Definition. Given a, b, c, and v(a;c),

\[ Q_{ab}(t;c) = \sum_j v_j(a;c)\, f_{jB}(t) \]

where j varies over the set a and B is the absorbing "state" corresponding
to b.


Given  the  underlying  MC,  M(-)>  the  above  Definition  will  yield  a  SMC 
iff  (letting  “  M(Ujj)  ,  being  the  elapsed  time) 

P[Rn+l  b(ac*-*)|Rn  a(c...),  Rn-1  c(d--0,  •••Rq  “ 

=  P[Rn+i  in  b(a)|Rn  in  a(c)] 

=  P[(Rn,  En+l)  =  (a.b),  T^+i  =  t|(Rn-l,  Rr)  =  (c,  a)] 

where  Tn+1  =  Un+l^Uj^.  Thus  v(b,  a(c**-))  =  v(b;a)  and  depends  only 

on  Qab(  Therefore,  it  is  necessary  and  sufficient  to  require  that 

a(c)  include  all  the  states  of  a  which  communicate  with  the  states  of  all 
other  ensembles,  (for  all  a,  c)  since  v(b;a)  depends  only  on  the  one  step 
MC  transitional  probabilities.  In  particular,  it  is  sufficient  that  a(c)  - 
a,  for  all  sets  a  and  c. 

Under  the  above  necessary  and  sufficient  condition,  we  can  now 

write 


“  1  Qab(  ;c)*Pbd(  ;a)(t) 
b 
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From  another  point  of  view,  we  can  also  let  a(c)  denote  the  state 
(^>Qax(  ^  varying  over  the  exit  states.  Using  this  latter 

notation,  we  can  set 

Qab(t;c)  =  Qa(c),b(t). 

For a given MC, the resultant number of states may be small enough
to warrant SMC reduction, in the above case of dependent pev's, if the
reduction in complexity is substantial enough. This extended SMC
reduction can be applied to SMC(9); S(a) = the ordered set (0,0') and
S(b) = (0'). However, nothing is gained here since we still have 4 states.

In  closing  this  chapter,  we  point  out  yet  another  deviation  from  the 
conditions  of  a  state  independent,  stationary  pev.  The  deviant  condition 
can  be  found  in  [6.2,  Chp.  5].  The  type  of  pev  found  there  is  an  initial 
pev  used  in  the  arbitrary  entry  case  of  CSP’s.  It  is  shown  that  the 
existence  of  these  pev’s  is  equivalent  to  that  of  initial  (or  delayed) 
holding  time  pdf’s  in  the  stationary  (or  random  entry)  case  for  ergodic 
SMC’s.  Thus,  this  special  type  of  pev  is  handled  in  a  manner  analogous 
to  that  used  for  state  dependent  pev’s  -  as  an  "index”  (given,  in  the 
paper  cited,  by  a  prime  over  the  Q’s). 


5.0  CONCLUSION. 

5.1 Summary. Two approaches to the DSI modification of CSP-11 are
considered in Chapters 2 through 4. The first approach, found in
Chapters 2 and 3, ignores any overlap in the inspection process by using
a functional, defined on a new DSI model, to count only the additional
units which are inspected from sampling phase segments - units which
would otherwise not be inspected under CSP-11. Since the functional TFI
is not sufficient to deal with all the important aspects of CSP-12, a
second approach, found in Chapter 4, uses a new functional, defined on a
slightly different DSI model, to take account of inspection overlaps. In
either treatment, there is no explicit backtracking in operational time
itself; both approaches incorporate the time shift into the transitional
changes, induced by DSI, which are, in turn, incorporated in the pdf's of
the underlying models. Throughout the paper, variations in functionals and
sampling plans, together with comparisons of them with the primary objects
of study, are also considered.

5.2 Methods Used. Two principal tools are used in the analysis of DSI:
SMC (and MRP) reduction and the z-transform. Since the SMC's constructed
for the analysis are modifications of the SMC model of CSP-11, the process
of constructing a SMC class from a MC model, described in Chapter 1,
is turned around. In Chapter 4, the importance of the probability
entrance vector (pev) is brought out by the incompatibility of SMC(9)
with the other CSP-12 models. Also in Chapter 4, the use of an
Expanded MRP in the analysis of DSI is illustrated; this kind of
analysis could be elaborated on for further investigation of functionals
dependent on a sequence of transitions.

We conclude this paper with the observation that DSI can be used
to modify the more complex CSP's described in Chapter 1.

APPENDIX


A.0 SEMI MARKOV CHAINS. Given that X(·) is a time homogeneous, aperiodic,
irreducible or absorbing, and finite state Semi Markov Chain (SMC) with
state space S, the following notation and statements are used in the body
of the text [cf. 6.7, 6.10, 6.14, and 6.15].

A.1 Notation and Definitions. For i, j, k, l in S:

1. Q_ik(t) = P[X(t)=k, X(t')=i, 0 < t' < t | X(0)=i].*

This function is the (defective) pdf of the time of sojourn in state i
until a transition is made to state k (for discrete t and i ≠ k).

2. P_ik(t) = P[X(t)=k | X(0)=i].

This function is the fundamental probability function of the SMC
for (i to k).

3. F_ik(t) = P[X(t)=k; X(t') ≠ k, 0 < t' < t | X(0)=i].

This function is the first entrance probability function for (i to k).

4.  Jk(t)  =  Ho*(6o-I 

This  function  is  the  probability  of  not  leaving  state  k  by  time  t. 

5. U_n(k) is the time of nth entry into k.

6. N_k(t) = Max {n | U_n(k) ≤ t}.

This random variable is the renewal function for state k.

7. U_n is the time of nth entry.

8. Y(n) = X(U_n) is the embedded Markov Chain associated with the SMC.


*This  definition  corrects  statement  3,  definition  5  in  [6.2,  p.  664] 




For the case where self transitions are allowed, we can use the
symbols above to define a Markov Renewal Process (MRP).

9. A MRP is the ordered pair (Y, U) such that, for states i, k in S,

\[ P[Y_n=k,\ T_n=t \mid Y_{n-1}=i,\ Y_{n-2}, \ldots, Y_0;\ T_{n-1}, \ldots, T_0] = P[Y_n=k,\ T_n=t \mid Y_{n-1}=i] = Q_{ik}(t), \]

where $T_n = U_n - U_{n-1}$.

(Note that this pdf is, in general, different from that defined in A.11.)

10. The SMC X(t) associated with a MRP is defined by

\[ X(t) = Y_{N(t)} \]

where $N(t) = \sum_j N_j(t)$, j in S.


A. 2  Statements. 

1.  By  time  homogeneity  and  the  method  of  first  entrance,  we  have  the 
Backward  Equations; 

2. P_ik(t) = F_ik * P_kk(t) + (δ_ik) J_k(t).

3.  If  =  Ho*Qik(+ 


T  ~  t'lik^ 

is  the  transitional  matrix  of  Y. 

*This definition corrects that given in [6.2, p. 695].

4. If X is irreducible, the equation

\[ e = eT \]

has a unique normalized solution called the stationary vector of
the SMC.

5.

\[ \lim_{t\to\infty} P_{ik}(t) = \frac{e_k \mu_k}{\sum_j e_j \mu_j} = \alpha_k \quad (\text{or } P_k(\infty)) \]

where $\mu_k$ is the mean time of sojourn in state k and the $e_k$ are the
components of e.

7. (Strong Ergodic Theorem.) If W is a functional defined on the SMC,
we have, as N approaches infinity,

\[ \frac{1}{N} \sum_{s \le N} W(X(s)) \ \text{approaches} \ E_\alpha[W] = \sum_k W(k)\,\alpha_k \quad [\text{a.e.}] \]

In  the  case  of  self  transitions,  we  have 

8. If (Y, U) is a MRP such that q_ii < 1, the unique SMC induced by the MRP
has its pdf's given via (i ≠ j)


A 


if  qii  >  0 


A 

=  Qij ,  otherwise 


where  the  Q’s  are  given  by  A. 19.  It  is  equivalent,  almost  everywhere, 
to  the  associated  SMC. 

9. The properties of time homogeneity, irreducibility, and aperiodicity
are preserved under filtration.


PROGRESSIVELY CENSORED SAMPLING IN THE
THREE PARAMETER LOG-NORMAL DISTRIBUTION*


A.  Clifford  Cohen 

The  University  of  Georgia 
Athens,  Georgia 

SUMMARY 

This paper is an extension of previous work by the writer concerning
progressively censored sampling in the normal distribution [4]
and  in  the  Weibull  distribution  [6].  Here  local  maximum  likelihood 
estimators  and  estimators  which  utilize  the  first  order  statistic  are 
derived  for  the  three-parameter  log-normal  distribution  when  samples 
are  progressively  censored.  An  illustrative  example  involving  life 
test  data  is  included.  Various  properties  of  the  proposed  estimators 
are  investigated. 

KEY  WORDS 

Log-normal  Distribution 
Progressively  Censored  Samples 
Life  Testing 

1. INTRODUCTION

Progressively censored samples frequently occur in life and fatigue
tests, where individual observations are time ordered and where
at various times during a test, some of the survivors are removed
(i.e. censored) from further observation. Samples of this type from
the normal and from the exponential distribution have received previous
attention from Herd [10], Roberts [18], and the writer [4]. Progressively
censored samples from the two-parameter Weibull distribution were
considered by the writer [5] and by Ringer and Sprinkle [17]. More recent
work by the writer [6] deals with progressive censoring in the
three-parameter Weibull distribution. The present paper is concerned with
progressive censoring in the three-parameter log-normal distribution.

2.  THE  SAMPLE 

Let N designate the total sample size, and n the number which fail
and therefore result in completely determined life spans. Suppose that
censoring occurs in k stages at times $T_j > T_{j-1}$, j = 1, 2, ..., k, and that
$r_j$ surviving items are removed (censored) from further observation at
the jth stage. Thus

\[ N = n + \sum_{j=1}^{k} r_j. \tag{1} \]

Two types of censoring are generally recognized. In Type I censoring,
which is of primary interest here, the $T_j$ are fixed, and the numbers of
survivors at these times are random variables. In Type II censoring,
the numbers of survivors are fixed and the $T_j$ are random variables. In
both types, the $r_j$ are either fixed or determined independently of the
life span X. The observations $x_i$ are ordered according to magnitude.

The likelihood function L(S), where S signifies a k-stage Type I
progressively censored sample of the type described, is

\[ L(S) = C \prod_{i=1}^{n} f(x_i) \prod_{j=1}^{k} [1 - F(T_j)]^{r_j}, \tag{2} \]

in which C is a constant while f(x) and F(x) are density and distribution
functions respectively.

3.  THE LOG-NORMAL DISTRIBUTION

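We write the density function for the three-parameter log-normal
distribution as

\[ f(x; \mu, \sigma, \gamma) = \frac{1}{(x-\gamma)\,\sigma\sqrt{2\pi}} \exp\{-[\ln(x-\gamma)-\mu]^2 / 2\sigma^2\}, \quad \gamma < x < \infty, \tag{3} \]

\[ = 0, \quad \text{elsewhere.} \]

This distribution derives its name from the fact that when the random
variable X is lognormal $(\mu, \sigma^2, \gamma)$, then $Y = \ln(X-\gamma)$ is normal $(\mu, \sigma^2)$.

The mean, median, mode, variance, coefficient of variation, $\beta_1$ and $\beta_2$
(Pearson's Betas) for this distribution (c.f. Yuan [23]) are

\[ \begin{aligned}
\mu_x &= \gamma + e^{\mu}\omega^{1/2}, \\
Me &= \gamma + e^{\mu}, \\
Mo &= \gamma + e^{\mu}\omega^{-1}, \\
V(x) &= e^{2\mu}\,\omega(\omega-1), \\
CV &= \sqrt{\omega-1}, \\
\beta_1 &= \alpha_3^2 = (\omega+2)^2(\omega-1), \\
\beta_2 &= \alpha_4 = \omega^4 + 2\omega^3 + 3\omega^2 - 3,
\end{aligned} \tag{4} \]

where

\[ \omega = e^{\sigma^2}, \tag{5} \]

and where $\alpha_3$ and $\alpha_4$ denote the third and fourth standard moments.

The coefficient of variation about the left terminus is defined as

\[ CV = \sqrt{V(x)} / (\mu_x - \gamma). \tag{6} \]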
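The summary measures of the three-parameter log-normal distribution are easy to evaluate numerically. The sketch below is illustrative only: the function name and dictionary keys are assumptions, the formulas are the standard ones for this distribution, and the parameter values are those quoted for the simulated life test in Section 8.

```python
import math


def lognormal3_summary(mu, sigma, gamma):
    """Mean, median, mode, variance and standard moments of a
    three-parameter log-normal distribution with omega = exp(sigma^2)."""
    omega = math.exp(sigma ** 2)
    scale = math.exp(mu)  # e^mu, the median of X - gamma
    mean = gamma + scale * math.sqrt(omega)
    variance = scale ** 2 * omega * (omega - 1.0)
    return {
        "mean": mean,
        "median": gamma + scale,
        "mode": gamma + scale / omega,
        "variance": variance,
        # Coefficient of variation measured from the left terminus gamma.
        "cv_about_terminus": math.sqrt(variance) / (mean - gamma),
        "alpha3": (omega + 2.0) * math.sqrt(omega - 1.0),
        "alpha4": omega ** 4 + 2.0 * omega ** 3 + 3.0 * omega ** 2 - 3.0,
    }


# Parameter values of the illustrative example in Section 8.
summary = lognormal3_summary(mu=5.0, sigma=0.3, gamma=100.0)
```

For any positive sigma the ordering mode < median < mean holds, which reflects the positive skewness of the distribution.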

Previous investigations by the writer [3], Aitchison and Brown [1],
Hill [11], Wilson and Worcester [21], and others have dealt with maximum
likelihood estimation in the three parameter log-normal distribution when
samples are complete. Harter and Moore [9] considered local maximum
likelihood estimation in the three parameter log-normal distribution
for singly and doubly censored as well as for complete samples. Hill
examined some unusual features of the likelihood function of this
distribution which had apparently escaped the notice of earlier
investigators. He demonstrated the existence of paths along which the
likelihood function of any ordered sample $x_1, \ldots, x_n$ tends to $\infty$ as
$(\gamma, \mu, \sigma^2)$ approach $(x_1, -\infty, \infty)$.

This global maximum of the likelihood function thereby leads to
the inadmissible estimators $\hat{\gamma} = x_1$, $\hat{\mu} = -\infty$ and $\hat{\sigma}^2 = \infty$ regardless of
the sample. On the other hand, when we equate partial derivatives of
the log-likelihood function to zero, solution of these equations leads
to local maximum likelihood estimates which in most cases are reasonable
and, as noted by Harter and Moore (loc. cit.), appear to possess
most of the desirable properties ordinarily associated with maximum
likelihood estimators. Exceptions may occur in small samples for which
the likelihood function fails to exhibit a clearly defined local
maximum.


4.  LOCAL MAXIMUM LIKELIHOOD ESTIMATION

With the p.d.f. as given in equation (3), the logarithm of the
likelihood function (2) becomes

\[ \ln L = -n \ln\sigma - \sum_{i=1}^{n} \ln(x_i-\gamma) - \frac{1}{2\sigma^2}\sum_{i=1}^{n} [\ln(x_i-\gamma)-\mu]^2 + \sum_{j=1}^{k} r_j \ln[1-F_j] + \ln C. \tag{7} \]

Local maximum likelihood estimators (LMLE) are obtained by
simultaneously solving the estimating equations

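\[ \begin{aligned}
\frac{\partial \ln L}{\partial \mu} &= \frac{1}{\sigma^2}\sum_{i=1}^{n}[\ln(x_i-\gamma)-\mu] - \sum_{j=1}^{k}\frac{r_j}{1-F_j}\frac{\partial F_j}{\partial \mu} = 0, \\
\frac{\partial \ln L}{\partial \sigma} &= -\frac{n}{\sigma} + \frac{1}{\sigma^3}\sum_{i=1}^{n}[\ln(x_i-\gamma)-\mu]^2 - \sum_{j=1}^{k}\frac{r_j}{1-F_j}\frac{\partial F_j}{\partial \sigma} = 0, \\
\frac{\partial \ln L}{\partial \gamma} &= \sum_{i=1}^{n}\frac{1}{x_i-\gamma} + \frac{1}{\sigma^2}\sum_{i=1}^{n}\frac{\ln(x_i-\gamma)-\mu}{x_i-\gamma} - \sum_{j=1}^{k}\frac{r_j}{1-F_j}\frac{\partial F_j}{\partial \gamma} = 0.
\end{aligned} \tag{8} \]

\[ Z_j = Z(\xi_j) = \frac{\phi(\xi_j)}{1 - F(\xi_j)}, \tag{9} \]

where

\[ F_j = F(T_j) = \int_{\gamma}^{T_j} f(x)\,dx = \int_{-\infty}^{y_j} g(y)\,dy = \int_{-\infty}^{\xi_j} \phi(z)\,dz = F(\xi_j), \tag{10} \]

in which f(x) is given by (3), g(y) is the normal density $(\mu, \sigma^2)$, $\phi(z)$
is the standard normal density (0,1), and

\[ y_j = \ln(T_j-\gamma), \quad \text{whereas} \quad \xi_j = (y_j-\mu)/\sigma. \tag{11} \]

It then follows from (9), (10) and (11) that

\[ \frac{1}{1-F_j}\frac{\partial F_j}{\partial \mu} = -\frac{Z_j}{\sigma}, \qquad \frac{1}{1-F_j}\frac{\partial F_j}{\partial \sigma} = -\frac{\xi_j Z_j}{\sigma}, \qquad \frac{1}{1-F_j}\frac{\partial F_j}{\partial \gamma} = -\frac{Z_j}{\sigma(T_j-\gamma)}. \tag{12} \]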

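When the results of (12) are substituted into (8), the estimating
equations become

\[ \begin{aligned}
&\sum_{i=1}^{n}[\ln(x_i-\gamma)-\mu] + \sigma\sum_{j=1}^{k} r_j Z_j = 0, \\
&\sum_{i=1}^{n}[\ln(x_i-\gamma)-\mu]^2 + \sigma^2\Big[\sum_{j=1}^{k} r_j \xi_j Z_j - n\Big] = 0, \\
&\sigma^2\sum_{i=1}^{n}\frac{1}{x_i-\gamma} + \sum_{i=1}^{n}\frac{\ln(x_i-\gamma)-\mu}{x_i-\gamma} + \sigma\sum_{j=1}^{k}\frac{r_j Z_j}{T_j-\gamma} = 0,
\end{aligned} \tag{13} \]

in which $Z_j = \phi(\xi_j)/[1 - F(\xi_j)]$ and $\xi_j = [\ln(T_j-\gamma)-\mu]/\sigma$.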
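The relation between the log-likelihood and the estimating equations can be checked numerically. The following is a minimal sketch, not the writer's program: the function names are assumptions, the constant ln C is dropped, and the three returned quantities are the left sides of the estimating equations written as sigma^2, sigma^3 and sigma^2 times the partial derivatives of ln L with respect to mu, sigma and gamma. The small sample at the end is hypothetical.

```python
import math


def phi(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def upper_tail(z):
    """1 - F(z) for the standard normal distribution."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def log_likelihood(x, T, r, mu, sigma, gamma):
    """ln L for failures x, censoring times T and removals r (ln C omitted)."""
    logs = [math.log(xv - gamma) for xv in x]
    value = -len(x) * math.log(sigma) - sum(logs)
    value -= sum((v - mu) ** 2 for v in logs) / (2.0 * sigma ** 2)
    for Tj, rj in zip(T, r):
        xi_j = (math.log(Tj - gamma) - mu) / sigma
        value += rj * math.log(upper_tail(xi_j))
    return value


def estimating_functions(x, T, r, mu, sigma, gamma):
    """Left sides of the three estimating equations (scaled score functions)."""
    dev = [math.log(xv - gamma) - mu for xv in x]
    xis = [(math.log(Tj - gamma) - mu) / sigma for Tj in T]
    Z = [phi(v) / upper_tail(v) for v in xis]
    h_mu = sum(dev) + sigma * sum(rj * zj for rj, zj in zip(r, Z))
    h_sigma = sum(d * d for d in dev) + sigma ** 2 * (
        sum(rj * v * zj for rj, v, zj in zip(r, xis, Z)) - len(x)
    )
    h_gamma = (
        sigma ** 2 * sum(1.0 / (xv - gamma) for xv in x)
        + sum(d / (xv - gamma) for d, xv in zip(dev, x))
        + sigma * sum(rj * zj / (Tj - gamma) for rj, zj, Tj in zip(r, Z, T))
    )
    return h_mu, h_sigma, h_gamma


# Hypothetical small sample: six failures, two censoring stages.
x = [167.91, 175.83, 185.88, 199.05, 220.59, 250.09]
T = [199.05, 250.09]
r = [2, 3]
h = estimating_functions(x, T, r, mu=5.0, sigma=0.3, gamma=100.0)
```

At trial parameter values that are not yet the estimates, the three functions are generally non-zero; a solver would drive all three to zero simultaneously.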


Various iterative techniques are available for simultaneously
solving these three equations for the required estimates $\hat{\mu}$, $\hat{\sigma}$, and $\hat{\gamma}$.
A procedure that has performed quite well for the writer involves
selecting a trial value $\gamma_i$ for $\gamma$, solving the first two equations with
$\gamma = \gamma_i$ for $\mu_i$ and $\sigma_i$ using the standard Newton technique (c.f. page 90
of reference [20]), and then substituting these values into the third
equation of (13). Once two values $\gamma_i$ and $\gamma_j$ have been found such that
the absolute difference $|\gamma_i - \gamma_j|$ is sufficiently small and such that
$H(\gamma_i, \mu_i, \sigma_i) > 0 > H(\gamma_j, \mu_j, \sigma_j)$, where $H(\gamma, \mu, \sigma)$ designates the left side
of the third equation of (13), the required estimates follow by linear
interpolation. The smallest sample observation, $x_1$, is of course an
upper bound on $\gamma$ and may thus be employed as a first approximation $\gamma_1$
in the iteration procedure.
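The final interpolation step of this procedure can be sketched as follows. This is an illustration under stated assumptions rather than the writer's program: the function name is hypothetical, each trial is represented as a tuple (gamma, mu, sigma), and the same interpolation weight is applied to all three components.

```python
def interpolate_estimates(trial_i, H_i, trial_j, H_j):
    """Linear interpolation between two trial solutions that bracket a root
    of H, i.e. H_i > 0 > H_j. Each trial is a tuple (gamma, mu, sigma)
    obtained by solving the first two estimating equations at a fixed gamma."""
    if not (H_i > 0.0 > H_j):
        raise ValueError("the two trials must bracket a sign change of H")
    weight = H_i / (H_i - H_j)  # fraction of the way from trial_i to trial_j
    return tuple(a + weight * (b - a) for a, b in zip(trial_i, trial_j))


# Hypothetical trial values, for illustration only.
estimate = interpolate_estimates((99.0, 5.02, 0.30), 1.0, (101.0, 4.98, 0.32), -1.0)
```

Because H is treated as linear between the two trials, the result is only as good as the bracket is narrow, which is why the text asks for a sufficiently small absolute difference between the two trial values of gamma.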

In the event that the third estimating equation of (13) is not
satisfied for any value of $\gamma$ in the permissible interval $\gamma < x_1$, then
the modified estimators of Section 5 are to be recommended.


Harter and Moore encountered the related problem in connection
with samples that are singly and doubly censored. With r observations
censored on the left so that $x_{r+1}$ is an upper bound on $\gamma$, their
recommendation is that an additional observation be censored on the left so
that $x_{r+2}$ then becomes a new upper bound on $\gamma$.


5.  MODIFIED  MAXIMUM  LIKELIHOOD  ESTIMATION 
Alternate estimators (MMLE), which have proven most satisfactory
in numerous applications, can be obtained by simultaneously solving the
estimating equations

\[ \frac{\partial \ln L}{\partial \mu} = 0, \qquad \frac{\partial \ln L}{\partial \sigma} = 0, \qquad \text{and} \qquad E[F(x_r)] = F(x_r), \]

where $x_r$ is the rth order statistic in a sample of size N. Only those
failures which occur prior to the time at which the first stage of
censoring takes place provide observed values for order statistics,
and thus the maximum value of r is limited. In most applications, we
might choose to set r=1, but a larger value might be preferred if there
is reason to suspect contamination of the sample data in the vicinity
of the terminus. Applicable estimating equations accordingly consist
of the first two equations of (13) plus a third equation involving $x_r$
as derived below. Since

\[ F(x_r) = \int_{\gamma}^{x_r} f(x)\,dx, \quad \text{and since} \quad E[F(x_r)] = \frac{r}{N+1}, \tag{14} \]

it follows that our third estimating equation becomes

\[ \gamma + e^{\mu + \sigma t_r} = x_r, \tag{15} \]

where $t_r$ is the standard normal deviate for which

\[ F(t_r) = \int_{-\infty}^{t_r} \phi(z)\,dz = \frac{r}{N+1}. \tag{16} \]

The modified estimators accordingly are found by simultaneously
solving the set of equations consisting of the first two equations of
(13) plus equation (15). The same procedure employed in Section 4 to
calculate the LMLE is also applicable here. On determining $\gamma_i$ and $\gamma_j$
such that $|\gamma_i - \gamma_j|$ is sufficiently small and such that
$G(\gamma_i, \mu_i, \sigma_i) > x_r > G(\gamma_j, \mu_j, \sigma_j)$, where
$G(\gamma, \mu, \sigma) = \gamma + e^{\mu + \sigma t_r}$, we interpolate for the
required estimates just as we did in Section 4.

6.  SOME  SPECIAL  CASES 

Various special cases in which at least one of the parameters is
known are of interest in certain applications. The following are
considered to be deserving of mention at this time.

MLE with $\gamma$ known.

With $\gamma$ known, there is no longer any distinction to be made
between a local maximum and a global maximum. The applicable estimating
equations in this case are the first two equations of (13), and they may
be solved iteratively for the required estimates $\hat{\mu}$ and $\hat{\sigma}$ as outlined
in Section 4. As an alternate technique, we might make the transformation
$y_i = \ln(x_i-\gamma)$ and then proceed as described in reference [4] for a
progressively censored sample from a normal distribution. Gajjar and
Khatri [7] previously considered this special case.

LMLE with $\sigma$ known.

It often happens that the shape parameter $\sigma$ and thus $\omega$ are known,
leaving only $\mu$ and $\gamma$ to be estimated from the sample data. In this case,
the applicable estimating equations consist of the first and third
equations of (13).

MMLE with $\sigma$ known.

In this case, the applicable estimating equations consist of the
first equation of (13) plus equation (15).


7.  ESTIMATE  VARIANCES  AND  COVARIANCES 
The asymptotic variance-covariance matrix of the estimators $\hat{\mu}$, $\hat{\sigma}$,
and $\hat{\gamma}$ is obtained by inverting the information matrix in which elements
are negatives of expected values of the second partial derivatives of
the logarithm of the likelihood function. For sufficiently large
samples, these expected values can be approximated by substituting the
estimates obtained from a given sample directly into the partial
derivatives which are given below.


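\[ \begin{aligned}
\frac{\partial^2 \ln L}{\partial \mu^2} &= -\frac{n}{\sigma^2} - \frac{1}{\sigma^2}\sum_{j=1}^{k} r_j Z_j (Z_j - \xi_j), \\
\frac{\partial^2 \ln L}{\partial \sigma^2} &= \frac{n}{\sigma^2} - \frac{3}{\sigma^4}\sum_{i=1}^{n}[\ln(x_i-\gamma)-\mu]^2 - \frac{1}{\sigma^2}\sum_{j=1}^{k} r_j \xi_j [2Z_j + \xi_j Z_j (Z_j - \xi_j)], \\
\frac{\partial^2 \ln L}{\partial \mu\,\partial \gamma} = \frac{\partial^2 \ln L}{\partial \gamma\,\partial \mu} &= -\frac{1}{\sigma^2}\sum_{i=1}^{n}\frac{1}{x_i-\gamma} - \frac{1}{\sigma^2}\sum_{j=1}^{k}\frac{r_j Z_j (Z_j - \xi_j)}{T_j-\gamma}, \\
\frac{\partial^2 \ln L}{\partial \sigma\,\partial \gamma} = \frac{\partial^2 \ln L}{\partial \gamma\,\partial \sigma} &= -\frac{2}{\sigma^3}\sum_{i=1}^{n}\frac{\ln(x_i-\gamma)-\mu}{x_i-\gamma} - \frac{1}{\sigma^2}\sum_{j=1}^{k}\frac{r_j [Z_j + \xi_j Z_j (Z_j - \xi_j)]}{T_j-\gamma}, \\
\frac{\partial^2 \ln L}{\partial \mu\,\partial \sigma} = \frac{\partial^2 \ln L}{\partial \sigma\,\partial \mu} &= -\frac{2}{\sigma^3}\sum_{i=1}^{n}[\ln(x_i-\gamma)-\mu] - \frac{1}{\sigma^2}\sum_{j=1}^{k} r_j [Z_j + \xi_j Z_j (Z_j - \xi_j)], \\
\frac{\partial^2 \ln L}{\partial \gamma^2} &= \sum_{i=1}^{n}\frac{1}{(x_i-\gamma)^2} - \frac{1}{\sigma^2}\sum_{i=1}^{n}\frac{1 - [\ln(x_i-\gamma)-\mu]}{(x_i-\gamma)^2} + \frac{1}{\sigma}\sum_{j=1}^{k}\frac{r_j [Z_j - Z_j (Z_j - \xi_j)/\sigma]}{(T_j-\gamma)^2}.
\end{aligned} \]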


Since the estimators $\hat{\mu}$, $\hat{\sigma}$ and $\hat{\gamma}$ are local rather than global
maximum likelihood estimators, the applicability of the variance-covariance
matrix obtained here might be open to question. However, a Monte
Carlo study by Nicholas Norgaard [16] indicates that the approximate
asymptotic variances and covariances obtained here should be considered
satisfactory when n ≥ 50, although they might be misleading as measures
of sampling error for small samples. Norgaard's results are consistent
with results of an earlier Monte Carlo study by Harter and Moore (loc.
cit.) in connection with singly and doubly censored samples. It is
also to be noted that Norgaard's study indicates that variances and
covariances of the MMLE are approximately equal to corresponding
measures of the MLE. This is an area of investigation that is continuing
to receive attention both from Norgaard and the writer.


8.  AN  ILLUSTRATIVE  EXAMPLE 

A simulated life test was conducted on 100 randomly selected
units of a certain electronic device having a log-normal life span
with $\mu$ = 5.0000, $\sigma$ = 0.3000 and $\gamma$ = 100. Sixty-five complete life
spans were observed, while thirty-five observations were censored in
three separate stages. Following are the life spans in hours, to two
decimal places, for the 65 units which failed during the test.

    167.91   200.88   219.14   232.91   246.61   262.59   287.71
    175.83   201.76   220.59   235.66   247.17   263.94   288.81
    185.88   205.31   222.00   236.75   249.14   266.12   291.30
    188.14   206.98   222.82   237.40   249.73   266.62   295.18
    189.08   210.78   224.33   239.05   250.09   267.01   297.38
    191.96   212.49   225.60   240.22   252.89   270.64
    195.61   213.24   226.50   240.64   253.57   271.76
    197.01   215.25   227.24   242.17   255.57   275.48
    198.76   216.75   227.24   243.03   260.60   279.62
    199.05   218.78   231.42   244.56   261.99   285.19

When the tenth failure occurred at time $T_1$ = 199.05, twelve units
selected at random from the survivors were censored (i.e. removed from
the test). When the forty-fifth failure occurred at time $T_2$ = 250.09,
ten additional randomly selected survivors were removed, and the test
was terminated at time $T_3$ = 297.38 with 13 survivors. In summarizing
these data, we record: N = 100, n = 65, $\sum_{j=1}^{3} r_j$ = 35, $x_1$ = 167.91,
$T_1$ = 199.05, $r_1$ = 12, $T_2$ = 250.09, $r_2$ = 10, $T_3$ = 297.38, $r_3$ = 13,
$\sum_{i=1}^{65} x_i$ = 15,327.43, $\bar{x}$ = 235.8066.

Estimates were calculated as described in Sections 4, 5 and 6
and are summarized in the following table.

In general, the estimates obtained here compare favorably with
corresponding population parameters.
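The summary figures recorded for this example can be reproduced directly from the listed life spans. The short sketch below is illustrative; the variable names are assumptions, and the values are the 65 failure times given in the text, listed in ascending order.

```python
# The 65 observed life spans (hours), in ascending order as in the text.
failures = [
    167.91, 175.83, 185.88, 188.14, 189.08, 191.96, 195.61, 197.01, 198.76, 199.05,
    200.88, 201.76, 205.31, 206.98, 210.78, 212.49, 213.24, 215.25, 216.75, 218.78,
    219.14, 220.59, 222.00, 222.82, 224.33, 225.60, 226.50, 227.24, 227.24, 231.42,
    232.91, 235.66, 236.75, 237.40, 239.05, 240.22, 240.64, 242.17, 243.03, 244.56,
    246.61, 247.17, 249.14, 249.73, 250.09, 252.89, 253.57, 255.57, 260.60, 261.99,
    262.59, 263.94, 266.12, 266.62, 267.01, 270.64, 271.76, 275.48, 279.62, 285.19,
    287.71, 288.81, 291.30, 295.18, 297.38,
]

removed = [12, 10, 13]  # r_1, r_2, r_3: survivors removed at each stage

# Censoring took place at the 10th, 45th and 65th failures.
censor_times = [failures[9], failures[44], failures[64]]

n = len(failures)
N = n + sum(removed)        # equation (1)
total = sum(failures)       # sum of the observed life spans
mean = total / n            # sample mean of the observed life spans
```

These quantities, together with the censoring times and removal counts, are the inputs needed by the estimating equations of Sections 4 and 5.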
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TABLE  1  -  SUMMARY  OF  ESTIMATES 
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